AI Glossary

Every AI term, explained simply.

312+ definitions across machine learning, deep learning, NLP, generative AI, computer vision, and more.

A (30 terms)

A/B Testing

A/B testing is an experimental method that compares two versions of a model, prompt, or interface to determine which performs better. In AI, A/B testing helps evaluate model outputs, UI changes, and prompt strategies by measuring user engagement or accuracy.

Data Science

Abstractive Summarization

Abstractive summarization generates new text that captures the key points of a longer document, rather than simply extracting existing sentences. It requires deep language understanding and generation capabilities.

Natural Language Processing

Accuracy

Accuracy is a metric that measures the proportion of correct predictions out of total predictions made by a model. While intuitive, accuracy can be misleading on imbalanced datasets where one class dominates.

Machine Learning
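
A minimal sketch of the metric, with a hypothetical imbalanced dataset showing how a majority-class model can score deceptively well:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Imbalanced toy dataset: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A model that always predicts the majority class...
y_pred = [0] * 100
# ...still scores 95% accuracy while missing every positive case.
print(accuracy(y_true, y_pred))  # 0.95
```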

Activation Function

An activation function is a mathematical function applied to a neuron's output to introduce non-linearity into a neural network. Common activation functions include ReLU, sigmoid, and tanh, each with different properties for gradient flow.

Deep Learning
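
The three activations named above can be sketched in a few lines of plain Python (the function shapes are standard; the sample inputs are arbitrary):

```python
import math

def relu(x):
    """Zero for negative inputs, identity for positive ones."""
    return max(0.0, x)

def sigmoid(x):
    """Squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squashes any real number into (-1, 1), centered at zero."""
    return math.tanh(x)

print(relu(-2.0), relu(3.0))  # 0.0 3.0
print(sigmoid(0.0))           # 0.5
print(tanh(0.0))              # 0.0
```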

Active Learning

Active learning is a machine learning approach where the model selectively queries an oracle (often a human) for labels on the most informative data points. This reduces the total amount of labeled data needed to train an accurate model.

Machine Learning

Adam Optimizer

Adam (Adaptive Moment Estimation) is an optimization algorithm that combines the benefits of AdaGrad and RMSProp. It adapts learning rates for each parameter using estimates of first and second moments of gradients.

Deep Learning
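
A single-parameter sketch of the Adam update, following the standard formulation with its usual default hyperparameters (the toy objective f(x) = x² is our own choice for illustration):

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    m and v are running estimates of the gradient's first and second
    moments; t is the 1-based step count used for bias correction.
    """
    m = b1 * m + (1 - b1) * grad       # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * grad ** 2  # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)          # bias-corrected estimates
    v_hat = v / (1 - b2 ** t)
    param -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Minimize f(x) = x^2 (gradient 2x) starting from x = 1.0.
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 501):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.05)
print(round(x, 3))  # converges near the minimum at 0
```

Note how the very first step has magnitude close to `lr` regardless of the gradient's scale, a characteristic property of Adam's per-parameter adaptation.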

Adapter Layers

Adapter layers are small trainable modules inserted into a pre-trained model to enable parameter-efficient fine-tuning. They allow task adaptation while keeping the original model weights frozen.

Deep Learning

Adversarial Attack

An adversarial attack is a technique that creates deliberately crafted inputs designed to fool a machine learning model into making incorrect predictions. These attacks reveal vulnerabilities in AI systems and are critical to AI safety research.

AI Ethics & Safety

Adversarial Training

Adversarial training is a defense strategy that improves model robustness by including adversarial examples in the training data. The model learns to correctly classify both normal and adversarially perturbed inputs.

AI Ethics & Safety

Agent

An AI agent is an autonomous system that perceives its environment, makes decisions, and takes actions to achieve specific goals. Modern AI agents can use tools, browse the web, write code, and chain multiple reasoning steps together.

AI Applications

Agentic AI

Agentic AI refers to AI systems that can autonomously plan, reason, and execute multi-step tasks with minimal human oversight. These systems use tool calling, memory, and iterative problem-solving to accomplish complex goals.

AI Applications

AGI

Artificial General Intelligence (AGI) refers to a hypothetical AI system with human-level cognitive abilities across all intellectual tasks. Unlike narrow AI, AGI would be able to learn, reason, and solve problems in any domain without task-specific training.

Fundamentals

AI Alignment

AI alignment is the research field focused on ensuring that AI systems pursue goals and behaviors consistent with human values and intentions. Alignment is considered one of the most important challenges in AI safety.

AI Ethics & Safety

AI Chip

An AI chip is a specialized processor designed specifically for artificial intelligence workloads like neural network training and inference. Examples include NVIDIA's GPUs, Google's TPUs, and custom ASICs.

AI Infrastructure

AI Ethics

AI ethics is the branch of ethics that examines the moral implications of developing and deploying artificial intelligence systems. It addresses fairness, transparency, privacy, accountability, and the societal impact of AI technology.

AI Ethics & Safety

AI Safety

AI safety is the interdisciplinary field focused on ensuring AI systems operate reliably, beneficially, and without causing unintended harm. It encompasses alignment, robustness, interpretability, and governance research.

AI Ethics & Safety

AI Visibility

AI visibility refers to how prominently a brand, product, or entity appears in AI-generated responses from systems like ChatGPT, Perplexity, and Gemini. As AI-powered search grows, visibility in AI recommendations becomes a critical marketing metric.

AI Applications

AI Winter

An AI winter is a period of reduced funding, interest, and research progress in artificial intelligence. Historical AI winters occurred in the 1970s and late 1980s, often following inflated expectations and undelivered promises.

Fundamentals

Algorithm

An algorithm is a step-by-step procedure or set of rules for solving a computational problem. In AI, algorithms define how models learn from data, make predictions, and optimize their performance.

Fundamentals

Annotation

Annotation is the process of adding labels or metadata to raw data to create training datasets for supervised learning. Data annotation can involve labeling images, tagging text, or marking audio segments.

Data Science

Anomaly Detection

Anomaly detection is the identification of data points, events, or patterns that deviate significantly from expected behavior. AI-based anomaly detection is used in fraud prevention, cybersecurity, and industrial monitoring.

Machine Learning

API

An API (Application Programming Interface) is a set of protocols and tools that allows different software systems to communicate. AI APIs enable developers to integrate machine learning capabilities like text generation, image recognition, and speech processing into applications.

AI Infrastructure

Artificial General Intelligence

Artificial General Intelligence is a theoretical form of AI that would match or exceed human cognitive abilities across all domains. AGI remains an aspirational goal rather than a current reality in AI research.

Fundamentals

Artificial Intelligence

Artificial Intelligence is the broad field of computer science focused on creating systems that can perform tasks requiring human-like intelligence. AI encompasses machine learning, natural language processing, computer vision, and robotics.

Fundamentals

Artificial Narrow Intelligence

Artificial Narrow Intelligence (ANI) refers to AI systems designed to perform specific tasks, such as image recognition or language translation. All current AI systems, including large language models, are forms of narrow intelligence.

Fundamentals

Artificial Superintelligence

Artificial Superintelligence (ASI) is a hypothetical AI that would surpass human intelligence in every cognitive dimension. The prospect of ASI raises profound questions about control, alignment, and the future of humanity.

Fundamentals

Attention Mechanism

An attention mechanism allows neural networks to focus on the most relevant parts of the input when producing each element of the output. Attention is the foundational innovation behind the Transformer architecture and modern large language models.

Deep Learning
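
A pure-Python sketch of scaled dot-product attention for a single query (the 2-d keys and values are made-up toy data):

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    exps = [math.exp(x - max(xs)) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Scores each key against the query, converts the scores into
    weights with softmax, and returns the weighted sum of values.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
# A query aligned with the first key attends mostly to the first value.
out = attention([5.0, 0.0], keys, values)
print([round(x, 2) for x in out])  # output dominated by the first value
```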

Autoencoder

An autoencoder is a neural network trained to compress input data into a compact representation and then reconstruct it. Autoencoders are used for dimensionality reduction, denoising, and learning latent representations.

Deep Learning

AutoML

Automated Machine Learning (AutoML) is the process of automating the end-to-end pipeline of applying machine learning, including feature engineering, model selection, and hyperparameter tuning. AutoML democratizes AI by reducing the expertise required.

Machine Learning

Autonomous Systems

Autonomous systems are AI-powered machines that can operate and make decisions independently without continuous human supervision. Examples include self-driving cars, delivery drones, and robotic warehouse systems.

Robotics & Automation

B (16 terms)

Backpropagation

Backpropagation is the algorithm used to train neural networks by computing gradients of the loss function with respect to each weight. It propagates error signals backward through the network to update weights and minimize prediction errors.

Deep Learning
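
A minimal worked example: backpropagation through one sigmoid neuron with squared-error loss, applying the chain rule by hand (the input, target, and learning rate are arbitrary toy values):

```python
import math

def forward_backward(w, b, x, target):
    """Forward pass for y = sigmoid(w*x + b), loss = (y - target)^2,
    then backward pass via the chain rule to get dloss/dw and dloss/db."""
    z = w * x + b
    y = 1.0 / (1.0 + math.exp(-z))   # forward pass
    dloss_dy = 2 * (y - target)      # chain rule, outermost factor first
    dy_dz = y * (1 - y)              # sigmoid derivative
    dloss_dz = dloss_dy * dy_dz
    return y, dloss_dz * x, dloss_dz  # gradients w.r.t. w and b

w, b, lr = 0.5, 0.0, 1.0
for _ in range(200):                 # gradient descent on one example
    y, dw, db = forward_backward(w, b, x=1.0, target=1.0)
    w -= lr * dw
    b -= lr * db
print(round(y, 2))  # prediction has moved close to the target of 1.0
```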

Bagging

Bagging (Bootstrap Aggregating) is an ensemble technique that trains multiple models on random subsets of training data and combines their predictions. Random Forest is the most well-known bagging-based algorithm.

Machine Learning

Batch Normalization

Batch normalization is a technique that normalizes layer inputs across mini-batches during training to stabilize and accelerate neural network training. It was introduced to reduce internal covariate shift and allows higher learning rates.

Deep Learning

Batch Size

Batch size is the number of training examples used in one iteration of gradient descent. Larger batches provide more stable gradient estimates but require more memory, while smaller batches add beneficial noise.

Deep Learning

Bayesian Network

A Bayesian network is a probabilistic graphical model that represents variables and their conditional dependencies using a directed acyclic graph. It enables reasoning under uncertainty and causal inference.

Machine Learning

Beam Search

Beam search is a decoding algorithm that explores multiple candidate sequences simultaneously, keeping only the top-k most promising at each step. It balances between greedy decoding and exhaustive search in text generation.

Natural Language Processing
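
A toy sketch of beam search over a hand-made table of next-token log-probabilities. The numbers are hypothetical, chosen so that the greedy first choice ("the") is not on the best overall path:

```python
import math

# Toy model: log P(next token | previous token). Hypothetical numbers.
LOGPROBS = {
    "<s>": {"the": math.log(0.6), "a": math.log(0.4)},
    "the": {"cat": math.log(0.5), "dog": math.log(0.5)},
    "a":   {"cat": math.log(0.9), "dog": math.log(0.1)},
    "cat": {"</s>": 0.0},
    "dog": {"</s>": 0.0},
}

def beam_search(k=2, steps=3):
    """Keep only the k highest-scoring partial sequences at each step."""
    beams = [(["<s>"], 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            for tok, lp in LOGPROBS.get(seq[-1], {}).items():
                candidates.append((seq + [tok], score + lp))
        if not candidates:
            break
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    return beams

best, score = beam_search()[0]
print(best)  # ['<s>', 'a', 'cat', '</s>']
```

Greedy decoding would commit to "the" (probability 0.6) and end at 0.3 overall, while the beam keeps "a" alive and finds the higher-probability "a cat" (0.36).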

Benchmark

A benchmark is a standardized test or dataset used to evaluate and compare the performance of different AI models. Common benchmarks include MMLU, HumanEval, and ImageNet.

Data Science

BERT

BERT (Bidirectional Encoder Representations from Transformers) is a language model developed by Google that reads text in both directions simultaneously. BERT revolutionized NLP by enabling deep bidirectional pre-training for language understanding tasks.

Natural Language Processing

Bias in AI

Bias in AI refers to systematic errors or unfair outcomes in machine learning models that arise from biased training data, flawed assumptions, or problematic design choices. Addressing AI bias is essential for building fair and equitable systems.

AI Ethics & Safety

Bias-Variance Tradeoff

The bias-variance tradeoff is the fundamental tension in machine learning between model simplicity (high bias) and model flexibility (high variance). Optimal models balance underfitting and overfitting to generalize well to new data.

Machine Learning

Bigram

A bigram is a contiguous sequence of two items (typically words or characters) from a given text. Bigram models estimate the probability of a word based on the immediately preceding word.

Natural Language Processing
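
A minimal maximum-likelihood bigram model over a toy sentence:

```python
from collections import Counter

def bigram_probs(text):
    """Estimate P(next word | current word) from bigram counts."""
    words = text.split()
    pair_counts = Counter(zip(words, words[1:]))
    word_counts = Counter(words[:-1])
    return {(w1, w2): c / word_counts[w1]
            for (w1, w2), c in pair_counts.items()}

probs = bigram_probs("the cat sat on the mat")
# "the" is followed by "cat" once and "mat" once, so each gets 0.5.
print(probs[("the", "cat")])  # 0.5
```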

Binary Classification

Binary classification is a supervised learning task where the model assigns inputs to one of exactly two categories. Spam detection (spam vs. not spam) and medical diagnosis (positive vs. negative) are common examples.

Machine Learning

Boltzmann Machine

A Boltzmann machine is a stochastic recurrent neural network that can learn a probability distribution over its inputs. Restricted Boltzmann Machines (RBMs) were influential in the deep learning revolution as building blocks for deep belief networks.

Deep Learning

Boosting

Boosting is an ensemble method that trains models sequentially, with each new model focusing on correcting the errors of previous ones. Popular boosting algorithms include AdaBoost, Gradient Boosting, and XGBoost.

Machine Learning

Bounding Box

A bounding box is a rectangular border drawn around an object in an image to indicate its location and extent. Bounding boxes are the primary output format for object detection models.

Computer Vision

Byte Pair Encoding

Byte Pair Encoding (BPE) is a subword tokenization algorithm that iteratively merges the most frequent pairs of characters or character sequences. BPE is widely used in modern language models to handle rare words and multilingual text.

Natural Language Processing
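
A simplified sketch of the BPE merge-learning loop. Real tokenizers track word frequencies, byte-level fallbacks, and special tokens; this version just runs the core idea on a tiny toy vocabulary:

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Learn BPE merges: repeatedly fuse the most frequent adjacent pair."""
    toks = [list(w) for w in words]  # start from character-level tokens
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for t in toks:
            pairs.update(zip(t, t[1:]))
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        merges.append(a + b)
        # Replace every occurrence of the pair with the merged symbol.
        for t in toks:
            i = 0
            while i < len(t) - 1:
                if t[i] == a and t[i + 1] == b:
                    t[i:i + 2] = [a + b]
                else:
                    i += 1
    return merges, toks

merges, toks = bpe_merges(["low", "lower", "lowest"], num_merges=2)
print(merges, toks[0])
```

After two merges the shared stem "low" has become a single subword token, which is exactly how BPE lets models represent rare words as combinations of frequent pieces.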

C (20 terms)

Catastrophic Forgetting

Catastrophic forgetting is the tendency of neural networks to abruptly lose previously learned knowledge when trained on new tasks. Continual learning research aims to overcome this limitation.

Deep Learning

Causal Inference

Causal inference is the process of determining cause-and-effect relationships from data, going beyond mere correlation. AI systems increasingly use causal reasoning to make more robust and interpretable decisions.

Data Science

Chain of Thought

Chain of thought is a prompting technique that encourages large language models to break down complex reasoning into intermediate steps. This approach significantly improves performance on math, logic, and multi-step reasoning tasks.

Generative AI

Chatbot

A chatbot is a software application that simulates human conversation through text or voice interactions. Modern AI chatbots use large language models to generate contextually relevant, natural-sounding responses.

AI Applications

ChatGPT

ChatGPT is an AI chatbot developed by OpenAI that uses large language models to generate human-like conversational responses. It became one of the fastest-growing consumer applications in history after its launch in November 2022.

Generative AI

Classification

Classification is a supervised learning task where the model predicts which category or class an input belongs to. Examples include email spam detection, image recognition, and sentiment analysis.

Machine Learning

Claude

Claude is an AI assistant developed by Anthropic, designed to be helpful, harmless, and honest. It is built using Constitutional AI techniques and competes with models like GPT-4 and Gemini.

Generative AI

Clustering

Clustering is an unsupervised learning technique that groups similar data points together without predefined labels. Common clustering algorithms include K-Means, DBSCAN, and hierarchical clustering.

Machine Learning

CNN

A CNN (Convolutional Neural Network) is a deep learning architecture designed to process grid-structured data like images. CNNs use convolutional filters to automatically learn spatial hierarchies of features.

Deep Learning

Computer Vision

Computer vision is a field of AI that enables machines to interpret and understand visual information from images and videos. Applications include facial recognition, autonomous driving, medical imaging, and augmented reality.

Computer Vision

Confusion Matrix

A confusion matrix is a table that summarizes a classification model's performance by showing true positives, true negatives, false positives, and false negatives. It provides a detailed breakdown beyond simple accuracy.

Machine Learning
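
A minimal sketch computing the four cells for binary labels (toy data, with 1 as the positive class):

```python
def confusion_matrix(y_true, y_pred):
    """Cell counts for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}

cm = confusion_matrix([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(cm)  # {'tp': 2, 'fp': 1, 'fn': 1, 'tn': 2}
```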

Constitutional AI

Constitutional AI is an approach developed by Anthropic that trains AI systems to be helpful, harmless, and honest using a set of written principles. The model critiques and revises its own outputs based on these constitutional rules.

AI Ethics & Safety

Continual Learning

Continual learning is the ability of an AI system to learn new tasks or knowledge over time without forgetting previously learned information. It aims to create more human-like learning that accumulates knowledge incrementally.

Machine Learning

Contrastive Learning

Contrastive learning is a self-supervised technique that trains models to distinguish between similar and dissimilar data pairs. It learns useful representations by pulling similar examples closer and pushing dissimilar ones apart in embedding space.

Deep Learning

Convolutional Neural Network

A convolutional neural network is a specialized deep learning architecture that applies learned filters across input data to detect patterns. CNNs excel at image recognition, object detection, and visual understanding tasks.

Deep Learning

Corpus

A corpus is a large, structured collection of text documents used for training and evaluating natural language processing models. The quality and diversity of a training corpus significantly impacts model performance.

Natural Language Processing

Cross-Entropy

Cross-entropy is a loss function that measures the difference between two probability distributions — typically the model's predictions and the true labels. It is the standard loss function for classification tasks in deep learning.

Machine Learning
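
A minimal sketch of the formula for discrete distributions, comparing a confident and an uncertain prediction against a one-hot label (toy numbers):

```python
import math

def cross_entropy(true_dist, pred_dist, eps=1e-12):
    """H(p, q) = -sum_i p_i * log(q_i); eps guards against log(0)."""
    return -sum(p * math.log(q + eps) for p, q in zip(true_dist, pred_dist))

# One-hot true label (class 1) vs. two predicted distributions.
true = [0.0, 1.0, 0.0]
confident = [0.05, 0.90, 0.05]  # low loss
uncertain = [0.40, 0.30, 0.30]  # higher loss
print(round(cross_entropy(true, confident), 3))  # 0.105
print(round(cross_entropy(true, uncertain), 3))  # 1.204
```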

Cross-Validation

Cross-validation is a model evaluation technique that splits data into multiple folds, training and testing on different subsets in rotation. K-fold cross-validation provides more reliable performance estimates than a single train-test split.

Data Science
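
A sketch of how k-fold splitting assigns indices (bookkeeping only; fitting and scoring a model in each fold is out of scope here):

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k folds; each fold serves once as the test set."""
    # Distribute any remainder across the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = set(range(start, start + size))
        train = [i for i in range(n) if i not in test]
        folds.append((train, sorted(test)))
        start += size
    return folds

for train, test in kfold_indices(6, 3):
    print("train:", train, "test:", test)
```

Every example appears in exactly one test fold, so the k per-fold scores together cover the whole dataset.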

CUDA

CUDA (Compute Unified Device Architecture) is NVIDIA's parallel computing platform that allows developers to use GPUs for general-purpose processing. CUDA is the foundation of GPU-accelerated deep learning training.

AI Infrastructure

Curriculum Learning

Curriculum learning is a training strategy that presents examples to a model in a meaningful order, starting with easier examples and progressively introducing harder ones. This mimics human learning and can improve convergence.

Machine Learning

D (19 terms)

Data Augmentation

Data augmentation is a technique that artificially increases training dataset size by creating modified versions of existing data. In computer vision, this includes rotations, flips, and color changes; in NLP, it includes paraphrasing and synonym replacement.

Data Science

Data Drift

Data drift occurs when the statistical properties of production data change over time compared to the training data. Drift can degrade model performance and requires monitoring and retraining strategies to address.

Data Science

Data Labeling

Data labeling is the process of assigning meaningful tags or annotations to raw data to create supervised learning datasets. High-quality labeled data is essential for training accurate machine learning models.

Data Science

Data Lake

A data lake is a centralized storage repository that holds vast amounts of raw data in its native format. AI systems often draw training data from data lakes that store structured, semi-structured, and unstructured information.

AI Infrastructure

Data Pipeline

A data pipeline is an automated series of data processing steps that moves and transforms data from source systems to a destination. ML data pipelines handle ingestion, cleaning, feature engineering, and model training workflows.

AI Infrastructure

Data Warehouse

A data warehouse is a centralized repository for structured, processed data optimized for analysis and reporting. AI and ML systems often source their training data from enterprise data warehouses.

AI Infrastructure

Decision Boundary

A decision boundary is the surface or line that separates different classes in a classification model's feature space. The shape and complexity of decision boundaries depend on the model architecture and training data.

Machine Learning

Decision Tree

A decision tree is a supervised learning algorithm that makes predictions by learning a series of if-then rules from training data. Decision trees are interpretable and form the basis of powerful ensemble methods like Random Forest.

Machine Learning

Deep Learning

Deep learning is a subset of machine learning that uses artificial neural networks with multiple layers to learn hierarchical representations of data. Deep learning has achieved breakthrough results in vision, language, and speech.

Deep Learning

Deep Reinforcement Learning

Deep reinforcement learning combines deep neural networks with reinforcement learning algorithms to handle complex, high-dimensional environments. It has achieved superhuman performance in games like Go, chess, and Atari.

Reinforcement Learning

Deepfake

A deepfake is AI-generated synthetic media that convincingly replaces a person's likeness, voice, or actions in images, audio, or video. Deepfakes raise significant concerns about misinformation and identity fraud.

AI Ethics & Safety

Depthwise Separable Convolution

Depthwise separable convolution is an efficient convolution variant that factorizes a standard convolution into depthwise and pointwise operations. It dramatically reduces computation while maintaining accuracy, enabling mobile AI.

Deep Learning

Diffusion Model

A diffusion model is a generative AI model that creates data by learning to reverse a gradual noise-adding process. Diffusion models power state-of-the-art image generation systems like Stable Diffusion and DALL-E.

Generative AI

Dimensionality Reduction

Dimensionality reduction is the process of reducing the number of features in a dataset while preserving its essential structure. Techniques like PCA and t-SNE help with visualization, noise reduction, and computational efficiency.

Data Science

Discriminator

A discriminator is the component of a GAN that learns to distinguish between real and generated data. It provides feedback to the generator, creating an adversarial training dynamic that improves output quality.

Generative AI

Distillation

Knowledge distillation is a model compression technique where a smaller student model learns to replicate the behavior of a larger teacher model. Distillation makes it possible to deploy powerful AI in resource-constrained environments.

Deep Learning

Distributed Training

Distributed training is the practice of splitting model training across multiple GPUs or machines to handle large models and datasets. It uses data parallelism or model parallelism to accelerate training.

AI Infrastructure

Dropout

Dropout is a regularization technique that randomly deactivates a fraction of neurons during training to prevent overfitting. It forces the network to learn redundant representations and improves generalization.

Deep Learning
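
A sketch of "inverted" dropout, the common formulation in which surviving activations are rescaled at training time so that inference needs no adjustment (toy activations):

```python
import random

def dropout(activations, p, training=True, rng=random):
    """Zero each activation with probability p; scale survivors by 1/(1-p).

    The 1/(1-p) scaling keeps the expected activation unchanged, so at
    inference time dropout is simply turned off with no rescaling.
    """
    if not training or p == 0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

random.seed(0)
out = dropout([1.0, 1.0, 1.0, 1.0], p=0.5)
print(out)  # roughly half the units zeroed, survivors scaled to 2.0
```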

Dynamic Programming

Dynamic programming is an algorithmic technique that solves complex problems by breaking them into simpler overlapping subproblems. It is used in reinforcement learning, sequence alignment, and optimal control.

Fundamentals

E (10 terms)

Edge AI

Edge AI refers to running artificial intelligence algorithms locally on hardware devices rather than in the cloud. Edge AI enables real-time inference with lower latency, better privacy, and reduced bandwidth requirements.

AI Infrastructure

Embedding

An embedding is a dense vector representation that captures the semantic meaning of data like words, sentences, or images in a continuous mathematical space. Similar items are mapped to nearby points, enabling semantic search and comparison.

Deep Learning
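
A minimal sketch of comparing embeddings with cosine similarity. The 3-d vectors are invented for illustration; real embeddings have hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a, b):
    """Similarity of two embedding vectors, ignoring their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical embeddings, chosen by hand so related words point the same way.
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.2]
car = [0.1, 0.2, 0.9]
print(round(cosine_similarity(cat, kitten), 3))  # high: related meanings
print(round(cosine_similarity(cat, car), 3))     # low: unrelated meanings
```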

Emergent Behavior

Emergent behavior refers to capabilities that appear in large AI models that were not explicitly trained for or predicted. Examples include in-context learning and chain-of-thought reasoning in large language models.

Fundamentals

Encoder-Decoder

An encoder-decoder is a neural network architecture where the encoder compresses input into a latent representation and the decoder generates output from it. This architecture is foundational for translation, summarization, and image captioning.

Deep Learning

Ensemble Learning

Ensemble learning combines multiple models to produce better predictions than any individual model alone. Techniques include bagging, boosting, and stacking, which reduce variance, bias, or both.

Machine Learning

Epoch

An epoch is one complete pass through the entire training dataset during model training. Training typically requires multiple epochs for the model to converge to good performance.

Machine Learning

Evaluation Metric

An evaluation metric is a quantitative measure used to assess model performance on a given task. Common metrics include accuracy, precision, recall, F1 score, AUC-ROC, and perplexity.

Machine Learning

Explainable AI

Explainable AI (XAI) encompasses techniques that make AI system decisions understandable to humans. XAI is crucial for building trust, meeting regulatory requirements, and debugging model behavior.

AI Ethics & Safety

Exploration vs Exploitation

Exploration vs exploitation is a fundamental dilemma in reinforcement learning between trying new actions to discover better rewards versus leveraging known good actions. Balancing both is key to optimal long-term performance.

Reinforcement Learning

Extractive Summarization

Extractive summarization selects and combines the most important sentences directly from a source document to create a summary. It preserves the original wording but may lack the coherence of abstractive approaches.

Natural Language Processing

F (12 terms)

F1 Score

The F1 score is the harmonic mean of precision and recall, providing a single metric that balances both. An F1 score of 1 indicates perfect precision and recall, while a score of 0 means that precision or recall is zero.

Machine Learning
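
The definition as a short function (the precision/recall inputs are hypothetical):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(round(f1_score(0.8, 0.6), 3))  # 0.686
print(f1_score(1.0, 1.0))            # 1.0
print(f1_score(1.0, 0.0))            # 0.0: one zero drags the score to zero
```

Because it is a harmonic mean, F1 punishes imbalance: perfect precision cannot compensate for zero recall.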

Face Recognition

Face recognition is a computer vision technology that identifies or verifies individuals by analyzing facial features in images or video. It is used in security systems, phone unlocking, and photo organization.

Computer Vision

Feature Engineering

Feature engineering is the process of creating, selecting, and transforming input variables to improve machine learning model performance. Good feature engineering often matters more than model choice for traditional ML tasks.

Data Science

Feature Extraction

Feature extraction is the process of automatically identifying and selecting the most informative representations from raw data. Deep learning models learn to extract features hierarchically, from simple edges to complex patterns.

Machine Learning

Feature Map

A feature map is the output of applying a convolutional filter to an input, representing the presence and location of detected features. Deeper layers produce feature maps capturing increasingly abstract patterns.

Deep Learning

Feature Store

A feature store is a centralized repository for storing, managing, and serving machine learning features. It enables feature reuse, consistency between training and serving, and collaboration across ML teams.

AI Infrastructure

Federated Learning

Federated learning is a machine learning approach where models are trained across decentralized devices without sharing raw data. It enables privacy-preserving AI by keeping data on local devices while aggregating model updates.

Machine Learning

Few-Shot Learning

Few-shot learning is the ability of a model to learn and generalize from only a small number of labeled examples. Large language models demonstrate impressive few-shot capabilities through in-context learning.

Machine Learning

Few-Shot Prompting

Few-shot prompting provides a language model with a small number of input-output examples in the prompt to demonstrate the desired task format. This technique helps models understand task requirements without any fine-tuning.

Generative AI
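
A sketch of assembling a few-shot prompt for sentiment classification. The examples and format are made up; in a real system the finished `prompt` string would be sent to a model API:

```python
# Hypothetical labeled examples demonstrating the task format.
examples = [
    ("great movie, loved it", "positive"),
    ("terrible plot, fell asleep", "negative"),
]
query = "surprisingly good soundtrack"

prompt = "Classify the sentiment of each review.\n\n"
for text, label in examples:
    prompt += f"Review: {text}\nSentiment: {label}\n\n"
# End with the unanswered case; the model completes the pattern.
prompt += f"Review: {query}\nSentiment:"
print(prompt)
```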

Fine-Tuning

Fine-tuning is the process of taking a pre-trained model and continuing training on a smaller, task-specific dataset. It adapts general knowledge to specialized domains while requiring far less data and compute than training from scratch.

Deep Learning

Foundation Model

A foundation model is a large AI model trained on broad data that can be adapted to a wide range of downstream tasks. GPT-4, Claude, Gemini, and DALL-E are examples of foundation models that serve as bases for specialized applications.

Generative AI

Frozen Layers

Frozen layers are neural network layers whose weights are not updated during fine-tuning. Freezing preserves learned representations from pre-training while allowing later layers to adapt to new tasks.

Deep Learning

G (16 terms)

GAN

A GAN (Generative Adversarial Network) is a generative model consisting of two competing neural networks — a generator and a discriminator. GANs produce realistic synthetic data by training these networks in an adversarial game.

Generative AI

Gaussian Process

A Gaussian process is a probabilistic model that defines a distribution over functions, providing both predictions and uncertainty estimates. Gaussian processes are used in Bayesian optimization and surrogate modeling.

Machine Learning

Gemini

Gemini is Google's family of multimodal AI models capable of processing text, images, audio, and video. It represents Google's most advanced AI system and competes with models like GPT-4 and Claude.

Generative AI

Generative Adversarial Network

A Generative Adversarial Network is a deep learning framework where two neural networks compete: a generator creates synthetic data while a discriminator evaluates authenticity. This adversarial process produces remarkably realistic outputs.

Generative AI

Generative AI

Generative AI refers to artificial intelligence systems that can create new content including text, images, music, code, and video. Technologies like GPT, DALL-E, and Stable Diffusion have made generative AI accessible to millions.

Generative AI

Generative Model

A generative model learns the underlying data distribution and can create new data samples that resemble the training data. Examples include GANs, VAEs, diffusion models, and autoregressive language models.

Generative AI

Generative Pre-trained Transformer

A Generative Pre-trained Transformer (GPT) is a type of large language model that generates text by predicting the next token in a sequence. Pre-trained on vast text corpora, GPT models exhibit broad language understanding and generation capabilities.

Generative AI

Genetic Algorithm

A genetic algorithm is an optimization technique inspired by natural selection that evolves solutions through selection, crossover, and mutation. It is used for complex optimization problems where gradient-based methods are impractical.

Fundamentals

GPT

GPT (Generative Pre-trained Transformer) is a series of large language models developed by OpenAI that generate human-quality text. GPT models are trained to predict the next token and can perform a wide range of language tasks.

Generative AI

GPU

A GPU (Graphics Processing Unit) is a specialized processor designed for parallel computation that has become essential for training deep learning models. GPUs from NVIDIA dominate AI computing with architectures optimized for matrix operations.

AI Infrastructure

Gradient

A gradient is a vector of partial derivatives that indicates the direction and rate of steepest increase of a function. In neural networks, gradients are used to update weights in the direction that minimizes the loss function.

Deep Learning

Gradient Clipping

Gradient clipping is a technique that limits the magnitude of gradients during training to prevent exploding gradients. It is essential for stable training of deep networks and recurrent architectures.

Deep Learning

Gradient Descent

Gradient descent is an iterative optimization algorithm that adjusts model parameters in the direction that reduces the loss function. Variants include stochastic gradient descent (SGD), mini-batch SGD, and Adam.

Deep Learning
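
A minimal sketch of the update rule on a one-dimensional toy objective:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient to minimize a function."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))  # 3.0
```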

Graph Neural Network

A graph neural network (GNN) is a deep learning architecture designed to operate on graph-structured data like social networks, molecules, and knowledge graphs. GNNs learn by passing messages between connected nodes.

Deep Learning

Ground Truth

Ground truth refers to the correct, verified labels or annotations in a dataset used to train and evaluate machine learning models. The quality of ground truth directly impacts model reliability.

Data Science

Grounding

Grounding in AI refers to connecting a model's language understanding to real-world knowledge, data, or sensory experience. Grounded AI systems produce more factual and contextually relevant outputs.

Natural Language Processing

H (10 terms)

Hallucination

Hallucination in AI refers to when a model generates plausible-sounding but factually incorrect or fabricated information. Reducing hallucinations is a major challenge for large language models used in high-stakes applications.

Generative AI

Hate Speech Detection

Hate speech detection is the AI task of automatically identifying harmful, abusive, or discriminatory language in text. It is a key component of content moderation systems on social media platforms.

AI Applications

Heuristic

A heuristic is a practical problem-solving approach that uses rules of thumb to find good-enough solutions efficiently. In AI search algorithms, heuristics guide exploration toward promising solutions.

Fundamentals

Hidden Layer

A hidden layer is any neural network layer between the input and output layers. Hidden layers progressively transform data into increasingly abstract representations that enable complex pattern recognition.

Deep Learning

Hierarchical Clustering

Hierarchical clustering is an unsupervised method that builds a tree-like hierarchy of nested clusters. It can be agglomerative (bottom-up merging) or divisive (top-down splitting) and produces a dendrogram visualization.

Machine Learning

Hugging Face

Hugging Face is a platform and community that provides open-source tools, pre-trained models, and datasets for natural language processing and machine learning. It has become the central hub for sharing and deploying AI models.

AI Infrastructure

Human-in-the-Loop

Human-in-the-loop (HITL) is an approach where humans actively participate in the AI decision-making or training process. HITL systems combine human judgment with AI speed to improve accuracy and safety.

AI Applications

Hyperparameter

A hyperparameter is a configuration value set before training that controls the learning process, such as learning rate, batch size, or number of layers. Unlike model parameters, hyperparameters are not learned from data.

Machine Learning

Hyperparameter Tuning

Hyperparameter tuning is the process of finding optimal hyperparameter values to maximize model performance. Methods include grid search, random search, and Bayesian optimization.

Machine Learning

Hypothesis Testing

Hypothesis testing is a statistical method used to determine whether observed results are statistically significant or due to random chance. In AI, it helps validate whether model improvements are meaningful.

Data Science

I (15 terms)

Image Captioning

Image captioning is the AI task of generating natural language descriptions of images. It requires both visual understanding (computer vision) and text generation (NLP) capabilities.

Computer Vision

Image Classification

Image classification is the computer vision task of assigning a label to an entire image based on its visual content. Deep learning models like ResNet and Vision Transformers achieve near-human accuracy on this task.

Computer Vision

Image Generation

Image generation is the AI task of creating new images from text prompts, sketches, or other inputs. Diffusion models and GANs are the leading approaches for photorealistic image synthesis.

Generative AI

Image Segmentation

Image segmentation is the process of partitioning an image into meaningful regions or classifying each pixel into a category. It is used in medical imaging, autonomous driving, and satellite analysis.

Computer Vision

Imitation Learning

Imitation learning is a technique where an AI agent learns to perform tasks by observing and mimicking expert demonstrations. It bridges the gap between supervised learning and reinforcement learning.

Reinforcement Learning

Imputation

Imputation is the process of replacing missing data values with substituted values based on statistical methods or machine learning. Proper imputation prevents biased model training from incomplete datasets.

Data Science
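
The simplest strategy is mean imputation, sketched below (library imputers offer many more options, such as median or model-based imputation):

```python
def mean_impute(values):
    # Replace missing entries (None) with the mean of the observed ones.
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]
```

For example, [1.0, None, 3.0] becomes [1.0, 2.0, 3.0].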

In-Context Learning

In-context learning is the ability of large language models to learn new tasks from examples provided within the input prompt, without any parameter updates. This emergent capability enables flexible task adaptation at inference time.

Generative AI

Inference

Inference is the process of using a trained model to make predictions on new, unseen data. Optimizing inference speed and cost is critical for deploying AI in production applications.

Machine Learning

Information Gain

Information gain measures the reduction in entropy achieved by splitting data on a particular feature. It is a core splitting criterion for building decision trees and is also used for feature selection.

Machine Learning
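
The computation follows directly from the definition; the labels and split below are a toy example:

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a label distribution, in bits.
    counts = Counter(labels)
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def information_gain(labels, split_groups):
    # Entropy before the split minus the weighted entropy after it.
    total = len(labels)
    after = sum(len(g) / total * entropy(g) for g in split_groups)
    return entropy(labels) - after
```

Splitting [1, 1, 0, 0] into the pure groups [1, 1] and [0, 0] yields the maximum gain of 1.0 bit.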

Information Retrieval

Information retrieval is the science of searching and extracting relevant documents or data from large collections. Modern AI-powered search uses embeddings and language models to understand semantic meaning.

AI Applications

Instance Segmentation

Instance segmentation is a computer vision task that identifies each object in an image and delineates its exact pixel boundary. Unlike semantic segmentation, it distinguishes between individual instances of the same class.

Computer Vision

Instruction Tuning

Instruction tuning is a fine-tuning process that trains language models to follow natural language instructions across diverse tasks. It greatly improves a model's ability to understand and execute user requests.

Generative AI

Intelligent Agent

An intelligent agent is an autonomous entity that observes its environment through sensors and acts upon it through actuators to achieve goals. Modern AI agents combine perception, reasoning, and action in complex workflows.

AI Applications

Inverse Reinforcement Learning

Inverse reinforcement learning infers the reward function that an expert is optimizing by observing their behavior. It enables AI systems to learn goals and preferences from demonstrations.

Reinforcement Learning

IoT and AI

IoT and AI refers to the integration of artificial intelligence with Internet of Things devices to enable smart, autonomous decision-making at the edge. This combination powers smart homes, industrial IoT, and wearable health devices.

AI Applications

K (6 terms)

K-Means Clustering

K-Means is an unsupervised clustering algorithm that partitions data into K groups by minimizing the distance between points and their assigned cluster centroids. It is one of the most widely used clustering methods.

Machine Learning
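
A compact NumPy sketch of the standard iteration (no empty-cluster handling, so it is illustrative rather than production-ready):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    # Start from k distinct data points as initial centroids.
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = np.argmin(((X[:, None] - centroids) ** 2).sum(-1), axis=1)
        # Move each centroid to the mean of its assigned points.
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids
```

On two well-separated blobs, both points in each blob end up in the same cluster.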

K-Nearest Neighbors

K-Nearest Neighbors (KNN) is a simple machine learning algorithm that classifies data points based on the majority class of their K closest neighbors. KNN requires no training phase but can be computationally expensive at inference.

Machine Learning
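
The whole algorithm fits in a few lines (Euclidean distance and majority vote; the helper name is illustrative):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    # Vote among the k training points closest to x.
    dists = ((X_train - x) ** 2).sum(axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

Note there is no fit step: all work happens at query time, which is why KNN is called a lazy learner.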

Kernel

A kernel is a function that computes similarity between data points, often used to map data into higher-dimensional spaces. Kernels enable Support Vector Machines and other algorithms to find non-linear decision boundaries.

Machine Learning

Knowledge Distillation

Knowledge distillation is a technique where a smaller model (student) is trained to mimic the outputs of a larger model (teacher). This transfers the teacher's knowledge into a more efficient model suitable for deployment.

Deep Learning

Knowledge Graph

A knowledge graph is a structured representation of real-world entities and the relationships between them. AI systems use knowledge graphs to enhance reasoning, question answering, and recommendation systems.

AI Applications

Knowledge Representation

Knowledge representation is the field of AI concerned with encoding information about the world in a form that AI systems can use for reasoning. It includes ontologies, semantic networks, and logic-based formalisms.

Fundamentals

L (12 terms)

Label

A label is the target output or ground truth annotation associated with a training example in supervised learning. Models learn to predict correct labels from input features during the training process.

Machine Learning

Language Model

A language model is an AI system that learns the probability distribution of sequences of words in a language. Modern language models like GPT and Claude can generate text, answer questions, and perform complex reasoning.

Natural Language Processing

Large Language Model

A Large Language Model (LLM) is a neural network with billions of parameters trained on massive text datasets to understand and generate human language. LLMs like GPT-4, Claude, and Gemini demonstrate broad capabilities across language tasks.

Natural Language Processing

Latent Space

Latent space is a compressed, lower-dimensional representation of data learned by a model. In generative AI, navigating latent space allows smooth interpolation between data points and controlled generation.

Deep Learning

Layer

A layer is a fundamental building block of a neural network that performs a specific transformation on its input. Common layer types include dense, convolutional, recurrent, and attention layers.

Deep Learning

Lazy Learning

Lazy learning is an approach where the model delays computation until a query is made rather than building a model during training. K-Nearest Neighbors is the most well-known lazy learning algorithm.

Machine Learning

Learning Rate

The learning rate is a hyperparameter that controls the step size during gradient descent optimization. Too high a learning rate causes instability, while too low a rate leads to slow convergence.

Deep Learning

Linear Regression

Linear regression is a statistical method that models the relationship between a dependent variable and one or more independent variables using a linear equation. It is one of the simplest and most interpretable ML algorithms.

Machine Learning
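
For a single feature, the least-squares fit can be written in a few lines of NumPy (the helper name is illustrative):

```python
import numpy as np

def fit_line(x, y):
    # Ordinary least squares: solve for slope a and intercept b of y ≈ a*x + b.
    X = np.column_stack([x, np.ones(len(x))])
    (a, b), *_ = np.linalg.lstsq(X, y, rcond=None)
    return a, b
```

Fitting points that lie exactly on y = 2x + 1 recovers slope 2 and intercept 1.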

Logistic Regression

Logistic regression is a classification algorithm that uses a sigmoid function to model the probability of a binary outcome. Despite its name, it is a classification method rather than a regression technique.

Machine Learning

LoRA

LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning technique that adds small trainable matrices to frozen pre-trained model weights. LoRA dramatically reduces the memory and compute required for fine-tuning large models.

Generative AI

Loss Function

A loss function is a mathematical function that quantifies how far a model's predictions are from the actual values. The model training process minimizes the loss function through optimization.

Machine Learning
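
A common example for regression is mean squared error, sketched here:

```python
def mse(y_true, y_pred):
    # Average squared gap between prediction and target.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

A perfect model scores 0; larger errors are penalized quadratically, so one bad prediction dominates several small ones.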

LSTM

Long Short-Term Memory (LSTM) is a type of recurrent neural network architecture designed to learn long-range dependencies in sequential data. LSTMs use gate mechanisms to control information flow and avoid the vanishing gradient problem.

Deep Learning

M (17 terms)

Machine Learning

Machine learning is a branch of artificial intelligence where systems learn patterns from data to make predictions or decisions without being explicitly programmed. It encompasses supervised, unsupervised, and reinforcement learning approaches.

Machine Learning

Markov Chain

A Markov chain is a mathematical model describing a sequence of events where the probability of each event depends only on the current state. Markov chains are used in language modeling, page ranking, and MCMC sampling.

Fundamentals
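
A toy two-state chain makes the "depends only on the current state" property concrete (the transition table is invented for illustration):

```python
import random

def sample_chain(transitions, start, steps, seed=0):
    # Walk the chain: the next state depends only on the current one.
    rng = random.Random(seed)
    state, path = start, [start]
    for _ in range(steps):
        nexts, probs = zip(*transitions[state].items())
        state = rng.choices(nexts, probs)[0]
        path.append(state)
    return path
```

With deterministic transitions A→B and B→A, the walk simply alternates: ["A", "B", "A", "B"].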

Markov Decision Process

A Markov Decision Process (MDP) is a mathematical framework for modeling sequential decision-making problems with probabilistic outcomes. MDPs are the formal foundation for reinforcement learning algorithms.

Reinforcement Learning

Masked Autoencoder

A masked autoencoder is a self-supervised learning method that masks random patches of an image and trains the model to reconstruct them. It has proven highly effective for pre-training vision models.

Computer Vision

Masked Language Model

A masked language model is a training approach where random tokens in a sentence are hidden and the model learns to predict them from context. BERT popularized masked language modeling as a pre-training objective.

Natural Language Processing

Meta-Learning

Meta-learning, or learning to learn, is an approach where AI systems learn how to quickly adapt to new tasks from limited data. Meta-learning algorithms optimize the learning process itself rather than just task performance.

Machine Learning

Minimax

Minimax is a decision-making algorithm used in adversarial settings where one player tries to maximize their score while the other minimizes it. It is the classical approach for game-playing AI systems.

Reinforcement Learning

Mixture of Experts

Mixture of Experts (MoE) is an architecture that uses multiple specialized sub-networks (experts) and a gating mechanism to route inputs to the most relevant experts. MoE enables scaling model capacity without proportionally increasing compute.

Deep Learning

MLOps

MLOps (Machine Learning Operations) is the practice of applying DevOps principles to the machine learning lifecycle, including development, deployment, monitoring, and maintenance. MLOps ensures reliable, reproducible, and scalable ML systems.

AI Infrastructure

Model Card

A model card is a documentation framework that provides essential information about a machine learning model, including its intended use, performance metrics, limitations, and ethical considerations.

AI Ethics & Safety

Model Collapse

Model collapse is a phenomenon where AI models trained on AI-generated data progressively lose diversity and quality over generations. It highlights the importance of maintaining high-quality human-generated training data.

Generative AI

Model Serving

Model serving is the process of deploying trained machine learning models to production environments where they can respond to prediction requests. Efficient serving requires optimization for latency, throughput, and cost.

AI Infrastructure

Monte Carlo Method

Monte Carlo methods are computational algorithms that use repeated random sampling to estimate mathematical results. In AI, they are used in reinforcement learning, probabilistic inference, and tree search algorithms.

Fundamentals
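
The classic textbook example is estimating π from random points in the unit square (sample size chosen for illustration):

```python
import random

def estimate_pi(n=100_000, seed=0):
    # Fraction of points landing inside the quarter circle, times 4.
    rng = random.Random(seed)
    inside = sum(rng.random() ** 2 + rng.random() ** 2 <= 1 for _ in range(n))
    return 4 * inside / n
```

The estimate gets closer to π as n grows, with error shrinking roughly as 1/√n.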

Multi-Agent System

A multi-agent system consists of multiple AI agents that interact, cooperate, or compete to solve complex problems. These systems model real-world scenarios like traffic management, markets, and collaborative robotics.

AI Applications

Multi-Head Attention

Multi-head attention is a mechanism that runs multiple attention operations in parallel, allowing the model to attend to different aspects of the input simultaneously. It is a core component of the Transformer architecture.

Deep Learning

Multi-Task Learning

Multi-task learning is a training approach where a model learns to perform multiple related tasks simultaneously. Sharing representations across tasks often improves performance and data efficiency.

Machine Learning

Multimodal AI

Multimodal AI refers to systems that can process and understand multiple types of data, such as text, images, audio, and video. Models like GPT-4 and Gemini are multimodal, enabling richer human-AI interaction.

Generative AI

N (12 terms)

N-gram

An N-gram is a contiguous sequence of N items from a text, used in language modeling and text analysis. Unigrams, bigrams, and trigrams capture local word patterns and co-occurrence statistics.

Natural Language Processing
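
Extracting N-grams is a one-liner over a token list:

```python
def ngrams(tokens, n):
    # All contiguous n-token windows of the sequence.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```

For example, the bigrams of ["the", "cat", "sat"] are ("the", "cat") and ("cat", "sat").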

Named Entity Recognition

Named Entity Recognition (NER) is an NLP task that identifies and classifies named entities like people, organizations, locations, and dates in text. NER is a fundamental building block for information extraction.

Natural Language Processing

Natural Language Generation

Natural Language Generation (NLG) is the AI task of producing coherent, human-readable text from structured data or prompts. Large language models have made NLG remarkably fluent and contextually appropriate.

Natural Language Processing

Natural Language Inference

Natural Language Inference (NLI) is the task of determining whether a hypothesis is entailed by, contradicts, or is neutral to a given premise. NLI benchmarks test a model's understanding of logical relationships in text.

Natural Language Processing

Natural Language Processing

Natural Language Processing (NLP) is the field of AI focused on enabling machines to understand, interpret, and generate human language. NLP powers applications from chatbots and translation to sentiment analysis and search.

Natural Language Processing

Natural Language Understanding

Natural Language Understanding (NLU) is the subfield of NLP focused on machine reading comprehension — extracting meaning, intent, and context from text. NLU is essential for virtual assistants and conversational AI.

Natural Language Processing

Neural Architecture Search

Neural Architecture Search (NAS) is an automated process for discovering optimal neural network architectures for a given task. NAS uses search algorithms to explore vast design spaces that would be impractical to navigate manually.

Deep Learning

Neural Network

A neural network is a computing system inspired by biological neurons that processes information through interconnected layers of nodes. Neural networks are the foundation of deep learning and power most modern AI applications.

Deep Learning

Neural Radiance Field

A Neural Radiance Field (NeRF) is a deep learning method that represents 3D scenes as continuous functions, enabling photorealistic novel view synthesis from 2D images. NeRFs have transformed 3D reconstruction and rendering.

Computer Vision

Noise

Noise in data science refers to random, irrelevant, or erroneous information in a dataset that can hinder model learning. Effective ML systems must distinguish meaningful signal from noise.

Data Science

Noise Injection

Noise injection is a regularization technique that adds random noise to inputs, weights, or gradients during training. It improves model robustness and generalization by preventing over-reliance on specific patterns.

Deep Learning

Normalization

Normalization is the process of scaling input features to a standard range or distribution to improve model training. Common techniques include min-max scaling, z-score standardization, and layer normalization.

Data Science
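
Min-max scaling, the simplest of these techniques, as a sketch (z-score standardization would subtract the mean and divide by the standard deviation instead):

```python
def min_max_scale(values):
    # Rescale values linearly into the range [0, 1].
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]
```

For example, [0, 5, 10] becomes [0.0, 0.5, 1.0].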

O (10 terms)

Object Detection

Object detection is a computer vision task that identifies and locates multiple objects within an image by predicting bounding boxes and class labels. YOLO, Faster R-CNN, and DETR are popular object detection models.

Computer Vision

Object Tracking

Object tracking is the computer vision task of following the movement of specific objects across consecutive frames in a video. It is essential for surveillance, autonomous driving, and sports analytics.

Computer Vision

One-Hot Encoding

One-hot encoding is a technique that converts categorical variables into binary vectors where only one element is 1 and the rest are 0. It is a standard preprocessing step for feeding categorical data to machine learning models.

Data Science
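
A minimal pure-Python sketch (libraries such as pandas and scikit-learn provide this as a built-in transform):

```python
def one_hot(labels):
    # Map each category to a binary vector with a single 1.
    cats = sorted(set(labels))
    index = {c: i for i, c in enumerate(cats)}
    return [[1 if index[x] == j else 0 for j in range(len(cats))]
            for x in labels]
```

With categories sorted alphabetically, ["red", "blue", "red"] encodes as [[0, 1], [1, 0], [0, 1]].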

One-Shot Learning

One-shot learning is the ability of a model to learn a new concept from just a single example. It is particularly important in applications like face verification where collecting many examples per person is impractical.

Machine Learning

Online Learning

Online learning is a training paradigm where the model updates its parameters incrementally as new data arrives, rather than retraining on the entire dataset. It is essential for streaming data and dynamic environments.

Machine Learning

Open Source AI

Open source AI refers to AI models, tools, and frameworks whose source code and weights are publicly available for use, modification, and distribution. Projects like LLaMA, Mistral, and PyTorch drive AI democratization.

AI Infrastructure

OpenAI

OpenAI is an AI research company that created ChatGPT, GPT-4, and DALL-E. Founded in 2015, it has been instrumental in advancing large language models and bringing generative AI to mainstream adoption.

Generative AI

Optical Character Recognition

Optical Character Recognition (OCR) is the technology that converts images of text into machine-readable text data. Modern OCR uses deep learning to handle diverse fonts, handwriting, and document layouts.

Computer Vision

Optimization

Optimization in machine learning is the process of adjusting model parameters to minimize (or maximize) an objective function. Gradient-based optimization methods are the backbone of neural network training.

Machine Learning

Overfitting

Overfitting occurs when a model learns the training data too well, including its noise and outliers, and fails to generalize to new data. Regularization, dropout, and early stopping are common strategies to combat overfitting.

Machine Learning

P (19 terms)

Panoptic Segmentation

Panoptic segmentation unifies semantic and instance segmentation by assigning every pixel a semantic class and an instance identity. It provides a complete understanding of scene composition.

Computer Vision

Parameter

A parameter is a learnable variable within a model that is adjusted during training, such as weights and biases in a neural network. Large language models contain billions of parameters.

Machine Learning

Parameter-Efficient Fine-Tuning

Parameter-Efficient Fine-Tuning (PEFT) refers to techniques that adapt large models by updating only a small subset of parameters. Methods like LoRA, adapters, and prefix tuning enable fine-tuning with minimal compute.

Generative AI

Perceptron

A perceptron is the simplest type of artificial neural network consisting of a single neuron that computes a weighted sum of inputs and applies a threshold function. It is the fundamental building block of more complex networks.

Deep Learning

Perplexity

Perplexity is a metric that measures how well a language model predicts a text sequence — lower perplexity indicates better prediction. It is also the name of an AI-powered search engine that provides cited, conversational answers.

Natural Language Processing
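
Perplexity is the exponential of the average negative log-probability the model assigned to each observed token; a toy uniform distribution below:

```python
import math

def perplexity(token_probs):
    # exp of the mean negative log-probability across the sequence.
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)
```

A model assigning probability 0.25 to every token has perplexity 4, as if choosing uniformly among 4 options.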

Pipeline

A pipeline in ML is a sequence of data processing and modeling steps chained together to automate a workflow. ML pipelines include data preprocessing, feature engineering, model training, and evaluation stages.

AI Infrastructure

Policy

A policy in reinforcement learning is a function that maps states to actions, defining the agent's behavior strategy. The goal of RL is to learn an optimal policy that maximizes cumulative reward.

Reinforcement Learning

Pose Estimation

Pose estimation is the computer vision task of detecting the position and orientation of a person's body joints in images or video. It enables applications in fitness tracking, motion capture, and human-computer interaction.

Computer Vision

Positional Encoding

Positional encoding adds information about token position to input embeddings in Transformer models, which otherwise have no inherent sense of sequence order. This enables the model to understand word order and sentence structure.

Deep Learning
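
A sketch of the sinusoidal variant from the original Transformer paper (assumes an even embedding dimension; learned positional embeddings are a common alternative):

```python
import numpy as np

def sinusoidal_encoding(seq_len, d_model):
    # Each position gets a unique pattern of sinusoids at
    # geometrically spaced frequencies.
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe
```

Position 0 encodes as alternating 0s (sin) and 1s (cos); later positions rotate through the sinusoids.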

Pre-training

Pre-training is the initial phase of training a model on a large, general dataset before fine-tuning on specific tasks. Pre-training enables models to learn broad language or visual understanding that transfers to many applications.

Deep Learning

Precision

Precision is a classification metric measuring the proportion of true positive predictions among all positive predictions. High precision means few false positives, which is important when the cost of false alarms is high.

Machine Learning
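
Computed directly from true and false positives (a toy binary example; this version does not guard against zero positive predictions):

```python
def precision(y_true, y_pred):
    # Of all positive predictions, how many were actually positive?
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fp)
```

With truth [1, 0, 1, 1] and predictions [1, 1, 1, 0], there are 2 true positives and 1 false positive, so precision is 2/3.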

Prediction

Prediction is the output of a trained model when given new input data. Machine learning predictions can be categorical (classification), numerical (regression), or generative (text, images).

Machine Learning

Principal Component Analysis

Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms data into a new coordinate system where the greatest variance lies along the first coordinates. PCA is widely used for data visualization and noise reduction.

Data Science
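
A minimal SVD-based sketch (library implementations add centering options, whitening, and explained-variance reporting):

```python
import numpy as np

def pca(X, n_components=1):
    # Project centered data onto the top principal directions.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T
```

For points lying on a line in 2D, a single component captures all of the variance.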

Prompt

A prompt is the input text or instruction given to a language model to elicit a desired response. The quality and specificity of prompts significantly influence the relevance and accuracy of AI-generated outputs.

Generative AI

Prompt Chaining

Prompt chaining is a technique where the output of one language model call becomes the input for the next, creating a pipeline of AI reasoning steps. It enables complex workflows that exceed what a single prompt can accomplish.

Generative AI

Prompt Engineering

Prompt engineering is the practice of designing and optimizing input prompts to get the best possible responses from AI language models. Techniques include few-shot examples, chain of thought, and structured formatting.

Generative AI

Prompt Injection

Prompt injection is a security vulnerability where malicious instructions embedded in user input override or manipulate an AI system's intended behavior. Defending against prompt injection is an active area of AI security research.

AI Ethics & Safety

Pruning

Pruning is a model compression technique that removes unnecessary weights or neurons from a neural network to reduce its size and computational cost. Pruned models can be significantly smaller while maintaining most of their accuracy.

Deep Learning

PyTorch

PyTorch is an open-source deep learning framework developed by Meta that provides flexible tensor computation with GPU acceleration. It is the most popular framework for AI research due to its intuitive design and dynamic computation graphs.

AI Infrastructure

R (19 terms)

Random Forest

Random Forest is an ensemble learning method that trains multiple decision trees on random data subsets and combines their predictions through voting. It is robust, requires minimal tuning, and handles both classification and regression.

Machine Learning

Recall

Recall is a classification metric measuring the proportion of actual positives that were correctly identified by the model. High recall is critical in medical diagnosis and other applications where missing true positives is costly.

Machine Learning
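
Recall is the mirror image of precision: it divides true positives by actual positives rather than predicted positives (a toy example; no guard for zero actual positives):

```python
def recall(y_true, y_pred):
    # Of all actual positives, how many did the model find?
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)
```

With truth [1, 1, 1, 0] and predictions [1, 0, 1, 0], the model found 2 of 3 actual positives, so recall is 2/3.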

Recommendation System

A recommendation system is an AI application that predicts and suggests items a user might be interested in. Netflix, Spotify, and Amazon use recommendation systems powered by collaborative filtering and deep learning.

AI Applications

Recurrent Neural Network

A Recurrent Neural Network (RNN) is a neural architecture designed for sequential data that maintains a hidden state across time steps. While largely superseded by Transformers, RNNs introduced the concept of memory in neural networks.

Deep Learning

Regression

Regression is a supervised learning task where the model predicts a continuous numerical value rather than a category. Examples include predicting house prices, stock returns, and temperature forecasts.

Machine Learning

Regularization

Regularization is a set of techniques that prevent overfitting by adding constraints or penalties to the model during training. Common methods include L1/L2 regularization, dropout, and early stopping.

Machine Learning

Reinforcement Learning

Reinforcement learning is a machine learning paradigm where an agent learns to make decisions by receiving rewards or penalties for its actions in an environment. It has achieved breakthroughs in game playing, robotics, and AI alignment.

Reinforcement Learning

Reinforcement Learning from Human Feedback

RLHF is a training technique that uses human preferences to fine-tune AI models, aligning their outputs with human values and expectations. RLHF is key to making language models helpful, harmless, and honest.

Generative AI

Representation Learning

Representation learning is the automatic discovery of useful data representations needed for machine learning tasks. Deep learning is fundamentally a form of representation learning that builds hierarchical feature abstractions.

Deep Learning

Residual Network

A Residual Network (ResNet) is a deep neural network architecture that uses skip connections to enable training of very deep networks. ResNets solved the vanishing gradient problem and enabled networks with hundreds of layers.

Deep Learning

Responsible AI

Responsible AI encompasses practices and principles for developing AI systems that are fair, transparent, accountable, and beneficial to society. It addresses bias, privacy, safety, and the broader social impact of AI technology.

AI Ethics & Safety

Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) is a technique that enhances language model responses by first retrieving relevant documents from a knowledge base. RAG reduces hallucinations and keeps AI responses grounded in up-to-date, factual information.

Generative AI

Reward Model

A reward model is a trained model that predicts human preferences between different AI outputs, providing a scalar reward signal. Reward models are central to RLHF and are used to align language models with human values.

Reinforcement Learning

Reward Shaping

Reward shaping is the practice of designing intermediate reward signals to guide reinforcement learning agents toward desired behaviors more efficiently. Good reward shaping accelerates training while avoiding unintended shortcuts.

Reinforcement Learning

RNN

An RNN (Recurrent Neural Network) is a class of neural networks where connections between nodes form cycles, allowing the network to maintain temporal state. While effective for sequences, RNNs struggle with long-range dependencies compared to Transformers.

Deep Learning

Robotic Process Automation

Robotic Process Automation (RPA) uses software robots to automate repetitive, rule-based business tasks like data entry and form processing. AI-enhanced RPA can handle unstructured data and make intelligent decisions.

Robotics & Automation

Robotics

Robotics is the field of engineering and AI focused on designing, building, and programming robots that can interact with the physical world. AI-powered robotics combines computer vision, planning, and motor control.

Robotics & Automation

Robustness

Robustness in AI refers to a model's ability to maintain performance when faced with unexpected inputs, adversarial attacks, or distribution shifts. Building robust AI systems is essential for reliable real-world deployment.

AI Ethics & Safety

ROC Curve

A ROC (Receiver Operating Characteristic) curve plots the true positive rate against the false positive rate at various classification thresholds. The area under the ROC curve (AUC) is a widely used metric for classifier performance.

Machine Learning
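
AUC has a simple rank-based reading: it is the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. A brute-force sketch (ties count as half):

```python
def auc(y_true, scores):
    # Probability a random positive outranks a random negative.
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A perfect classifier scores 1.0; random guessing scores 0.5.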

S (23 terms)

Sampling

Sampling in generative AI is the process of selecting tokens from a probability distribution during text generation. Different sampling strategies like top-k and top-p control the randomness and creativity of outputs.

Generative AI
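
A NumPy sketch of top-k sampling (top-p would instead keep the smallest set of tokens whose cumulative probability exceeds p; the logits below are invented):

```python
import numpy as np

def top_k_sample(logits, k, rng=None):
    # Keep only the k highest-scoring tokens, renormalize, then sample.
    rng = rng or np.random.default_rng(0)
    logits = np.asarray(logits, dtype=float)
    top = np.argsort(logits)[-k:]
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()
    return int(top[rng.choice(len(top), p=probs)])
```

With logits [10, 0, -10, 5] and k=2, only tokens 0 and 3 can ever be sampled; the rest are pruned.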

Scaling Laws

Scaling laws are empirical relationships showing how model performance improves predictably with increases in model size, data, and compute. They guide decisions about resource allocation in training large AI models.

Deep Learning

Self-Attention

Self-attention is an attention mechanism where each element in a sequence computes attention scores with every other element in the same sequence. It enables Transformers to capture long-range dependencies regardless of distance.

Deep Learning
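
A single-head NumPy sketch (the projection matrices are placeholders; real models learn them and use multiple heads):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Each token builds a query, key, and value; attention weights come
    # from scaled dot products between queries and keys.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V
```

Each output row is a weighted mix of every token's value vector, so distant tokens influence each other in a single step.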

Self-Supervised Learning

Self-supervised learning is a training approach where models generate their own supervisory signals from unlabeled data. Pre-training large language models with next-token prediction is a form of self-supervised learning.

Machine Learning

Semantic Search

Semantic search uses AI to understand the meaning and intent behind queries rather than just matching keywords. It leverages embeddings and language models to return results that are conceptually relevant.

AI Applications

Semantic Similarity

Semantic similarity is a measure of how closely two pieces of text convey the same meaning. AI computes semantic similarity using vector embeddings, enabling applications like duplicate detection and recommendation.

Natural Language Processing
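
The standard measure over embeddings is cosine similarity, sketched here:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)
```

Identical directions score 1.0 and orthogonal ones 0.0, regardless of vector length.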

Semi-Supervised Learning

Semi-supervised learning uses a combination of a small amount of labeled data and a large amount of unlabeled data for training. It bridges the gap between supervised and unsupervised learning.

Machine Learning

Sentiment Analysis

Sentiment analysis is the NLP task of determining the emotional tone or opinion expressed in text — positive, negative, or neutral. It is widely used in brand monitoring, customer feedback analysis, and social media analytics.

Natural Language Processing

Sequence-to-Sequence

Sequence-to-sequence (Seq2Seq) is a model architecture that transforms one sequence into another, used in translation, summarization, and dialogue. It consists of an encoder that reads the input and a decoder that generates the output.

Deep Learning

SHAP

SHAP (SHapley Additive exPlanations) is an explainability method based on game theory that assigns each feature an importance value for a particular prediction. SHAP provides consistent, locally accurate explanations for any ML model.

AI Ethics & Safety

Sigmoid Function

The sigmoid function is an activation function that maps any input to a value between 0 and 1, making it useful for binary classification outputs. It has been largely replaced by ReLU in hidden layers but remains standard for binary classification output layers.

Deep Learning
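The function itself is a one-liner, σ(x) = 1 / (1 + e^(−x)); a minimal sketch:

```python
import numpy as np

def sigmoid(x):
    # Squashes any real input into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

y = sigmoid(np.array([-4.0, 0.0, 4.0]))  # small, 0.5, large
```

Inputs near zero land near 0.5, while large positive or negative inputs saturate toward 1 or 0, which is exactly the saturation that causes vanishing gradients in deep hidden layers.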

Sim-to-Real Transfer

Sim-to-real transfer is the process of training AI models in simulation and deploying them in the real world. It is crucial in robotics where real-world training is expensive, slow, or dangerous.

Robotics & Automation

Softmax

Softmax is a function that converts a vector of raw scores into a probability distribution where all values sum to 1. It is the standard output activation for multi-class classification and attention mechanisms.

Deep Learning
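A minimal NumPy sketch, including the standard max-subtraction trick for numerical stability:

```python
import numpy as np

def softmax(scores):
    # Subtracting the max doesn't change the result but avoids overflow
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))  # sums to 1, ordering preserved
```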

Sparse Model

A sparse model activates only a subset of its parameters for each input, reducing computational cost while maintaining capacity. Mixture of Experts and pruned networks are common sparse model architectures.

Deep Learning

Speech Recognition

Speech recognition is the AI capability of converting spoken language into text. Modern systems like Whisper use deep learning to achieve near-human accuracy across multiple languages.

AI Applications

Stable Diffusion

Stable Diffusion is an open-source AI image generation model that creates images from text descriptions using a latent diffusion process. Its open nature has spurred a large community of developers and artists.

Generative AI

Stochastic Gradient Descent

Stochastic Gradient Descent (SGD) is an optimization algorithm that updates model weights using the gradient computed from a random subset (mini-batch) of training data. SGD is computationally efficient and adds beneficial noise that helps escape local minima.

Deep Learning
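A toy sketch of mini-batch SGD fitting a single slope parameter on synthetic linear data (the learning rate, batch size, and epoch count are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: y = 3x + noise; SGD should recover a slope near 3
X = rng.uniform(-1, 1, size=200)
y = 3.0 * X + rng.normal(0, 0.1, size=200)

w = 0.0
lr = 0.1
for epoch in range(50):
    idx = rng.permutation(200)                 # shuffle each epoch
    for start in range(0, 200, 20):            # mini-batches of 20
        batch = idx[start:start + 20]
        # Gradient of mean squared error w.r.t. w on this mini-batch
        grad = 2 * np.mean((w * X[batch] - y[batch]) * X[batch])
        w -= lr * grad
```

Each update uses only 20 of the 200 examples, which is what makes the method "stochastic" and cheap per step.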

Style Transfer

Style transfer is a computer vision technique that applies the artistic style of one image to the content of another. Neural style transfer uses deep learning to separate and recombine content and style representations.

Computer Vision

Supervised Learning

Supervised learning is a machine learning approach where models learn from labeled training data — input-output pairs. It is the most common ML paradigm, powering classification and regression tasks.

Machine Learning

Support Vector Machine

A Support Vector Machine (SVM) is a classification algorithm that finds the optimal hyperplane separating different classes with maximum margin. SVMs are effective for high-dimensional data and small datasets.

Machine Learning

Swarm Intelligence

Swarm intelligence is a collective behavior that emerges from groups of simple agents following local rules, inspired by natural systems like ant colonies and bird flocks. It is used in optimization and multi-robot coordination.

Robotics & Automation

Synthetic Data

Synthetic data is artificially generated data that mimics the statistical properties of real-world data. It is used to augment training datasets, protect privacy, and test models when real data is scarce or sensitive.

Data Science

System Prompt

A system prompt is a hidden instruction given to a language model that defines its behavior, persona, and constraints for a conversation. System prompts shape how AI assistants respond without being visible to end users.

Generative AI
T19 terms

Temperature

Temperature is a parameter in language model text generation that controls the randomness of output. Lower temperatures produce more deterministic, focused responses, while higher temperatures increase creativity and diversity.

Generative AI
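Mechanically, temperature divides the model's logits before the softmax: dividing by a small number sharpens the distribution, dividing by a large number flattens it. A minimal sketch with made-up logits:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.5])
cold = softmax(logits / 0.2)  # low temperature: sharp, near-deterministic
hot = softmax(logits / 2.0)   # high temperature: flat, more random
```

At very low temperature the top token gets nearly all the probability mass; at high temperature the choices approach uniform.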

Tensor

A tensor is a multi-dimensional array of numbers that serves as the fundamental data structure in deep learning. Scalars, vectors, and matrices are all specific cases of tensors.

Deep Learning

TensorFlow

TensorFlow is an open-source machine learning framework developed by Google that provides tools for building and deploying ML models. It supports distributed training, mobile deployment, and production serving.

AI Infrastructure

Text Classification

Text classification is the NLP task of assigning predefined categories to text documents. Applications include spam filtering, topic labeling, and content moderation.

Natural Language Processing

Text Generation

Text generation is the AI task of producing coherent, contextually relevant text, typically through autoregressive language models. Modern text generation powers chatbots, creative writing tools, and code assistants.

Generative AI

Text-to-Image

Text-to-image generation creates visual images from natural language descriptions using AI models like DALL-E, Midjourney, and Stable Diffusion. It has transformed creative workflows and content production.

Generative AI

Text-to-Speech

Text-to-speech (TTS) is the AI technology that converts written text into natural-sounding spoken audio. Modern TTS systems produce remarkably human-like voices with appropriate prosody and emotion.

AI Applications

Token

A token is the basic unit of text that a language model processes, which can be a word, subword, or character depending on the tokenizer. GPT-4 processes text in tokens, with roughly 4 characters per token in English.

Natural Language Processing

Tokenization

Tokenization is the process of splitting text into tokens that a language model can process. Modern tokenizers like BPE and SentencePiece balance vocabulary size with the ability to represent any text sequence.

Natural Language Processing
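One common subword strategy is greedy longest-match against a learned vocabulary (a WordPiece-style sketch; the toy vocabulary below is invented for illustration, and real tokenizers learn theirs from a corpus):

```python
def greedy_tokenize(word, vocab):
    # Repeatedly take the longest vocabulary piece that matches at position i
    tokens = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(word[i])  # fall back to a single character
            i += 1
    return tokens

vocab = {"token", "iza", "tion", "un", "believ", "able"}
tokens = greedy_tokenize("tokenization", vocab)  # ["token", "iza", "tion"]
```

The single-character fallback is why subword tokenizers can represent any input string, even words never seen during vocabulary training.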

Tool Use

Tool use in AI refers to a language model's ability to interact with external tools like calculators, web browsers, code interpreters, and APIs. Tool use extends AI capabilities beyond pure text generation.

Generative AI

Top-k Sampling

Top-k sampling is a text generation strategy that restricts token selection to the k most probable next tokens. It prevents the model from selecting highly unlikely tokens while maintaining output diversity.

Generative AI
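A minimal sketch of the filtering step: zero out everything outside the top k, then renormalize so the kept probabilities sum to 1 (the example distribution is made up):

```python
import numpy as np

def top_k_filter(probs, k):
    # Keep only the k most probable tokens, renormalize the remainder
    keep = np.argsort(probs)[-k:]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = np.array([0.5, 0.3, 0.15, 0.05])
filtered = top_k_filter(probs, k=2)  # only the first two tokens survive
```

Sampling then proceeds from `filtered` instead of the full distribution, so very unlikely tokens can never be chosen.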

Top-p Sampling

Top-p sampling (nucleus sampling) selects from the smallest set of tokens whose cumulative probability exceeds a threshold p. It dynamically adjusts the candidate pool size based on the model's confidence.

Generative AI
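A minimal sketch of the nucleus-filtering step: sort tokens by probability and keep the smallest prefix whose cumulative mass reaches p (the example distribution is made up):

```python
import numpy as np

def top_p_filter(probs, p):
    # Sort descending; keep the smallest prefix with cumulative mass >= p
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = np.array([0.5, 0.3, 0.15, 0.05])
filtered = top_p_filter(probs, p=0.9)  # keeps 3 tokens (0.5 + 0.3 + 0.15)
```

Unlike top-k, the number of surviving tokens varies: a confident distribution keeps one or two, a flat one keeps many.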

TPU

A TPU (Tensor Processing Unit) is a custom-designed AI accelerator chip developed by Google specifically for neural network computations. TPUs power Google's AI services and are available through Google Cloud.

AI Infrastructure

Training Data

Training data is the dataset used to teach a machine learning model to recognize patterns and make predictions. The quality, quantity, and representativeness of training data fundamentally determine model capabilities.

Data Science

Transfer Learning

Transfer learning is the technique of applying knowledge gained from training on one task to improve performance on a different but related task. By leveraging pre-trained knowledge, it enables building powerful AI models from limited domain-specific data.

Machine Learning

Transformer

The Transformer is a neural network architecture based on self-attention mechanisms that processes all input positions in parallel. Introduced in 2017, it became the foundation for virtually all modern large language models and many vision models.

Deep Learning

Trustworthy AI

Trustworthy AI is an approach to building AI systems that are reliable, fair, transparent, privacy-preserving, and safe. It encompasses technical, ethical, and governance dimensions of AI development.

AI Ethics & Safety

Turing Test

The Turing Test is a measure of machine intelligence proposed by Alan Turing where a human judge evaluates whether they are conversing with a human or a machine. If the judge cannot reliably distinguish, the machine is said to exhibit intelligent behavior.

Fundamentals

Type I and Type II Error

A Type I error (false positive) occurs when a model incorrectly predicts a positive outcome, while a Type II error (false negative) occurs when it incorrectly predicts a negative one. Understanding these errors is crucial for evaluating model performance in context.

Data Science
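A minimal sketch of counting both error types from paired labels and predictions (the labels below are made up; 1 = positive, 0 = negative):

```python
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]

# Type I: predicted positive when the truth was negative (false positive)
type_1 = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
# Type II: predicted negative when the truth was positive (false negative)
type_2 = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
```

Which error matters more depends on the application: a spam filter should minimize Type I errors (legitimate mail flagged), while a medical screen should minimize Type II errors (missed diagnoses).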
