AI Glossary

Every AI term,
explained simply.

Deep Learning

Autoencoder

An autoencoder is a neural network trained to compress input data into a compact representation and then reconstruct it. Autoencoders are used for dimensionality reduction, denoising, and learning latent representations.

Deep Learning

AutoML

Automated Machine Learning (AutoML) is the process of automating the end-to-end pipeline of applying machine learning, including feature engineering, model selection, and hyperparameter tuning. AutoML democratizes AI by reducing the expertise required.

Machine Learning

Autonomous Systems

Bounding Box

A bounding box is a rectangular border drawn around an object in an image to indicate its location and extent. Bounding boxes are the primary output format for object detection models.

Computer Vision

Byte Pair Encoding

Curriculum Learning

Reinforcement Learning

Extractive Summarization

Frozen Layers

Frozen layers are neural network layers whose weights are not updated during fine-tuning. Freezing preserves learned representations from pre-training while allowing later layers to adapt to new tasks.

Deep Learning

G (16 terms)

GAN

A GAN (Generative Adversarial Network) is a generative model consisting of two competing neural networks — a generator and a discriminator. GANs produce realistic synthetic data by training these networks in an adversarial game.

Generative AI

Gaussian Process

A Gaussian process is a probabilistic model that defines a distribution over functions, providing both predictions and uncertainty estimates. Gaussian processes are used in Bayesian optimization and surrogate modeling.

Machine Learning

Gemini

Gemini is Google's family of multimodal AI models capable of processing text, images, audio, and video. Google positions it as its flagship AI system, and it competes with models like GPT-4 and Claude.

Generative AI

Generative Adversarial Network

Natural Language Processing

H (10 terms)

Hallucination

Hallucination in AI refers to when a model generates plausible-sounding but factually incorrect or fabricated information. Reducing hallucinations is a major challenge for large language models used in high-stakes applications.

Generative AI

Hate Speech Detection

Hate speech detection is the AI task of automatically identifying harmful, abusive, or discriminatory language in text. It is a key component of content moderation systems on social media platforms.

AI Applications

Heuristic

A heuristic is a practical problem-solving approach that uses rules of thumb to find good-enough solutions efficiently. In AI search algorithms, heuristics guide exploration toward promising solutions.

Fundamentals
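
As a rough illustration of a heuristic guiding search, the sketch below runs A* on a small grid, using Manhattan distance as the rule of thumb for "how far is the goal". The grid, the function names, and the choice of A* are assumptions made for this example only.

```python
import heapq

def manhattan(a, b):
    """Heuristic: grid distance ignoring walls (never overestimates on this grid)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def a_star(grid, start, goal):
    """Return the length of a shortest path on a 4-connected grid, or None."""
    rows, cols = len(grid), len(grid[0])
    frontier = [(manhattan(start, goal), 0, start)]  # (estimate, cost so far, cell)
    best_cost = {start: 0}
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best_cost.get(cell, float("inf")):
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    estimate = new_cost + manhattan((nr, nc), goal)
                    heapq.heappush(frontier, (estimate, new_cost, (nr, nc)))
    return None

# 0 = free cell, 1 = wall (hypothetical map)
GRID = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path_length = a_star(GRID, (0, 0), (2, 0))  # must go around the wall: 6 steps
```

The heuristic only changes the order in which cells are explored; the cost bookkeeping is what keeps the answer correct.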

Hidden Layer

A hidden layer is any neural network layer between the input and output layers. Hidden layers progressively transform data into increasingly abstract representations that enable complex pattern recognition.

Deep Learning

Hierarchical Clustering

Hierarchical clustering is an unsupervised method that builds a tree-like hierarchy of nested clusters. It can be agglomerative (bottom-up merging) or divisive (top-down splitting) and produces a dendrogram visualization.

Machine Learning
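
The agglomerative (bottom-up) variant can be sketched in a few lines. This is a minimal example on 1-D points with single linkage (distance between the closest members); the linkage choice and the sample points are assumptions for illustration.

```python
def single_linkage(points):
    """Agglomerative clustering on 1-D points; returns the merge history."""
    clusters = [[p] for p in points]  # start with every point in its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the two closest members
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = sorted(clusters[i] + clusters[j])
        merges.append((d, merged))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

history = single_linkage([1.0, 2.0, 9.0, 10.0, 25.0])
# Merge distances, in order: 1.0, 1.0, 7.0, 15.0
```

The recorded merge distances are the heights at which branches would join in a dendrogram.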

Hugging Face

Hugging Face is a platform and community that provides open-source tools, pre-trained models, and datasets for natural language processing and machine learning. It has become the central hub for sharing and deploying AI models.

AI Infrastructure

Human-in-the-Loop

Human-in-the-loop (HITL) is an approach where humans actively participate in the AI decision-making or training process. HITL systems combine human judgment with AI speed to improve accuracy and safety.

AI Applications

Hyperparameter

A hyperparameter is a configuration value set before training that controls the learning process, such as learning rate, batch size, or number of layers. Unlike model parameters, hyperparameters are not learned from data.

Machine Learning

Hyperparameter Tuning

Hyperparameter tuning is the process of finding optimal hyperparameter values to maximize model performance. Methods include grid search, random search, and Bayesian optimization.

Machine Learning
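
A minimal sketch of grid search and random search follows. The `validation_score` function is a hypothetical stand-in for "train a model and measure it on validation data"; its shape and the search ranges are assumptions chosen so the example runs instantly.

```python
import itertools
import random

def validation_score(learning_rate, batch_size):
    """Hypothetical stand-in for training a model and scoring it (higher is better)."""
    return -(learning_rate - 0.01) ** 2 * 1e4 - (batch_size - 64) ** 2 / 1e4

def grid_search(learning_rates, batch_sizes):
    """Try every combination and keep the best-scoring one."""
    return max(itertools.product(learning_rates, batch_sizes),
               key=lambda cfg: validation_score(*cfg))

def random_search(n_trials, seed=0):
    """Sample configurations at random; learning rate is drawn on a log scale."""
    rng = random.Random(seed)
    trials = [(10 ** rng.uniform(-4, -1), rng.choice([16, 32, 64, 128]))
              for _ in range(n_trials)]
    return max(trials, key=lambda cfg: validation_score(*cfg))

best_grid = grid_search([0.001, 0.01, 0.1], [32, 64, 128])  # (0.01, 64)
best_random = random_search(20)
```

Grid search is exhaustive over the listed values, while random search spends the same budget on values the grid would never try.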

Hypothesis Testing

J (3 terms)

Jaccard Index

The Jaccard index is a similarity metric that measures the overlap between two sets by dividing the size of their intersection by the size of their union. It is commonly used in object detection evaluation, where it is known as Intersection over Union (IoU), and in text similarity.

Data Science
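
The definition translates directly into code. The sketch below computes the index for two token sets and for two axis-aligned boxes; the `(x1, y1, x2, y2)` box format and the convention of returning 1.0 for two empty sets are assumptions of this example.

```python
def jaccard(a, b):
    """|A ∩ B| / |A ∪ B| for two collections; 1.0 by convention when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def box_iou(box_a, box_b):
    """Jaccard index of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

text_sim = jaccard("the cat sat".split(), "the cat ran".split())  # 2 shared of 4 distinct = 0.5
overlap = box_iou((0, 0, 2, 2), (1, 1, 3, 3))                     # 1 / (4 + 4 - 1) = 1/7
```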

JAX

JAX is a Google-developed numerical computing library that combines NumPy-like syntax with automatic differentiation and GPU/TPU acceleration. JAX is increasingly popular for high-performance machine learning research.

AI Infrastructure

Joint Probability

Joint probability is the probability of two or more events occurring simultaneously. Understanding joint probability distributions is fundamental to probabilistic machine learning and Bayesian inference.
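
A small worked example, assuming two fair six-sided dice: enumerate the sample space, then compute the joint probability of two events and the conditional probability derived from it. The events chosen are arbitrary.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate over (die1, die2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def first_is_even(o):
    return o[0] % 2 == 0

def total_is_eight(o):
    return o[0] + o[1] == 8

p_a = prob(first_is_even)                                         # 1/2
p_b = prob(total_is_eight)                                        # 5/36
p_joint = prob(lambda o: first_is_even(o) and total_is_eight(o))  # 1/12
p_b_given_a = p_joint / p_a                                       # 1/6
```

Here the joint probability (1/12) differs from the product of the marginals (5/72), so the two events are not independent.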

LSTM

Long Short-Term Memory (LSTM) is a type of recurrent neural network architecture designed to learn long-range dependencies in sequential data. LSTMs use gate mechanisms to control information flow and mitigate the vanishing gradient problem.

Deep Learning

M (17 terms)

Machine Learning

Machine learning is a branch of artificial intelligence where systems learn patterns from data to make predictions or decisions without being explicitly programmed. It encompasses supervised, unsupervised, and reinforcement learning approaches.

Machine Learning

Markov Chain

A Markov chain is a mathematical model describing a sequence of events where the probability of each event depends only on the current state. Markov chains are used in language modeling, page ranking, and MCMC sampling.

Fundamentals
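
To make the "depends only on the current state" idea concrete, here is a minimal two-state sketch. The weather states and transition probabilities are invented for illustration.

```python
# Hypothetical transition probabilities: TRANSITIONS[current][next]
TRANSITIONS = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}

def step(distribution):
    """Advance a probability distribution over states by one transition."""
    nxt = {state: 0.0 for state in TRANSITIONS}
    for state, p in distribution.items():
        for target, t in TRANSITIONS[state].items():
            nxt[target] += p * t
    return nxt

dist = {"sunny": 0.0, "rainy": 1.0}  # start on a rainy day
for _ in range(50):
    dist = step(dist)
# dist is now very close to the stationary distribution: sunny 5/6, rainy 1/6
```

Repeatedly applying the transition rule forgets the starting state, which is the property MCMC sampling relies on.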

Markov Decision Process

A Markov Decision Process (MDP) is a mathematical framework for modeling sequential decision-making problems with probabilistic outcomes. MDPs are the formal foundation for reinforcement learning algorithms.

Reinforcement Learning

Masked Autoencoder

A masked autoencoder is a self-supervised learning method that masks random patches of an image and trains the model to reconstruct them. It has proven highly effective for pre-training vision models.

Computer Vision

Masked Language Model

Masked language modeling is a training approach where random tokens in a sentence are hidden and the model learns to predict them from context; a model trained this way is called a masked language model. BERT popularized masked language modeling as a pre-training objective.

Natural Language Processing

Meta-Learning

Meta-learning, or learning to learn, is an approach where AI systems learn how to quickly adapt to new tasks from limited data. Meta-learning algorithms optimize the learning process itself rather than just task performance.

Machine Learning

Minimax

Minimax is a decision-making algorithm used in adversarial settings where one player tries to maximize their score while the other minimizes it. It is the classical approach for game-playing AI systems.

Reinforcement Learning
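
A minimal sketch of the idea on a hand-written game tree, where leaves are final scores and inner nodes are lists of possible moves. The tree and its representation are assumptions for this example; practical game engines add depth limits and pruning.

```python
def minimax(node, maximizing):
    """Value of a game tree: leaves are numbers, inner nodes are lists of children."""
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# The maximizer picks a branch, then the minimizer picks a leaf inside it.
TREE = [[3, 5], [2, 9], [0, 7]]
value = minimax(TREE, maximizing=True)  # minimizer yields 3, 2, 0; maximizer takes 3
```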

Mixture of Experts

Mixture of Experts (MoE) is an architecture that uses multiple specialized sub-networks (experts) and a gating mechanism to route inputs to the most relevant experts. MoE enables scaling model capacity without proportionally increasing compute.

Deep Learning

MLOps

MLOps (Machine Learning Operations) is the practice of applying DevOps principles to the machine learning lifecycle, including development, deployment, monitoring, and maintenance. MLOps ensures reliable, reproducible, and scalable ML systems.

AI Infrastructure

Model Card

A model card is a documentation framework that provides essential information about a machine learning model, including its intended use, performance metrics, limitations, and ethical considerations.

AI Ethics & Safety

Model Collapse

Model collapse is a phenomenon where AI models trained on AI-generated data progressively lose diversity and quality over generations. It highlights the importance of maintaining high-quality human-generated training data.

Generative AI

Model Serving

Model serving is the process of deploying trained machine learning models to production environments where they can respond to prediction requests. Efficient serving requires optimization for latency, throughput, and cost.

AI Infrastructure

Monte Carlo Method

Monte Carlo methods are computational algorithms that use repeated random sampling to estimate mathematical results. In AI, they are used in reinforcement learning, probabilistic inference, and tree search algorithms.

Fundamentals
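
The classic illustration is estimating pi by sampling random points in the unit square and counting how many fall inside the quarter circle. The sample count and seed below are arbitrary choices.

```python
import random

def estimate_pi(n_samples, seed=0):
    """Estimate pi from the fraction of random points inside the unit quarter circle."""
    rng = random.Random(seed)
    inside = sum(1 for _ in range(n_samples)
                 if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * inside / n_samples

pi_estimate = estimate_pi(100_000)
```

The error shrinks roughly with the square root of the number of samples, so each extra digit of accuracy costs about 100 times more samples.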

Multi-Agent System

A multi-agent system consists of multiple AI agents that interact, cooperate, or compete to solve complex problems. These systems model real-world scenarios like traffic management, markets, and collaborative robotics.

AI Applications

Multi-Head Attention

Multi-head attention is a mechanism that runs multiple attention operations in parallel, allowing the model to attend to different aspects of the input simultaneously. It is a core component of the Transformer architecture.

Deep Learning

Multi-Task Learning

Multi-task learning is a training approach where a model learns to perform multiple related tasks simultaneously. Sharing representations across tasks often improves performance and data efficiency.

Machine Learning

Multimodal AI

Multimodal AI refers to systems that can process and understand multiple types of data, such as text, images, audio, and video. Models like GPT-4 and Gemini are multimodal, enabling richer human-AI interaction.

Generative AI

N (12 terms)

N-gram

An N-gram is a contiguous sequence of N items from a text, used in language modeling and text analysis. Unigrams, bigrams, and trigrams capture local word patterns and co-occurrence statistics.

Natural Language Processing
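
Extracting N-grams is a sliding window over the token list. A short sketch, using whitespace tokenization and a made-up sentence as simplifying assumptions:

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous runs of n tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the cat sat on the mat the cat ran".split()
bigram_counts = Counter(ngrams(tokens, 2))
# ("the", "cat") appears twice; every other bigram appears once
```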

Named Entity Recognition

Named Entity Recognition (NER) is an NLP task that identifies and classifies named entities like people, organizations, locations, and dates in text. NER is a fundamental building block for information extraction.

Natural Language Processing

Natural Language Generation

Natural Language Generation (NLG) is the AI task of producing coherent, human-readable text from structured data or prompts. Large language models have made NLG remarkably fluent and contextually appropriate.

Natural Language Processing

Natural Language Inference

Natural Language Inference (NLI) is the task of determining whether a hypothesis is entailed by, contradicts, or is neutral to a given premise. NLI benchmarks test a model's understanding of logical relationships in text.

Natural Language Processing

Natural Language Processing

Natural Language Processing (NLP) is the field of AI focused on enabling machines to understand, interpret, and generate human language. NLP powers applications from chatbots and translation to sentiment analysis and search.

Natural Language Processing

Natural Language Understanding

Natural Language Understanding (NLU) is the subfield of NLP focused on machine reading comprehension — extracting meaning, intent, and context from text. NLU is essential for virtual assistants and conversational AI.

Natural Language Processing

Neural Architecture Search

Neural Architecture Search (NAS) is an automated process for discovering optimal neural network architectures for a given task. NAS uses search algorithms to explore vast design spaces that would be impractical to navigate manually.

Deep Learning

Neural Network

A neural network is a computing system inspired by biological neurons that processes information through interconnected layers of nodes. Neural networks are the foundation of deep learning and power most modern AI applications.

Deep Learning

Neural Radiance Field

A Neural Radiance Field (NeRF) is a deep learning method that represents 3D scenes as continuous functions, enabling photorealistic novel view synthesis from 2D images. NeRFs have transformed 3D reconstruction and rendering.

Computer Vision

Noise

Noise in data science refers to random, irrelevant, or erroneous information in a dataset that can hinder model learning. Effective ML systems must distinguish meaningful signal from noise.

Data Science

Noise Injection

Noise injection is a regularization technique that adds random noise to inputs, weights, or gradients during training. It improves model robustness and generalization by preventing over-reliance on specific patterns.

Deep Learning

Normalization

Optimization

Optimization in machine learning is the process of adjusting model parameters to minimize (or maximize) an objective function. Gradient-based optimization methods are the backbone of neural network training.

Machine Learning

Overfitting

Q (4 terms)

Q-Learning

Q-learning is a model-free reinforcement learning algorithm that learns the value of actions in states to find an optimal policy. It uses a Q-table or neural network to estimate expected cumulative rewards for each state-action pair.

Reinforcement Learning
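
A tabular sketch of the update rule on a toy problem: four states on a line, with a reward of 1 for reaching the rightmost one. The environment, the hyperparameter values, and the use of purely random exploration are all assumptions of this example.

```python
import random

N_STATES, GOAL = 4, 3      # states 0..3 on a line; reaching state 3 ends the episode
LEFT, RIGHT = 0, 1
ALPHA, GAMMA = 0.5, 0.9    # learning rate and discount factor (arbitrary choices)

def env_step(state, action):
    """Hypothetical toy environment: move along the line, reward 1 at the goal."""
    nxt = min(state + 1, GOAL) if action == RIGHT else max(state - 1, 0)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-table: q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Random exploration is enough here because Q-learning is off-policy.
            action = rng.choice((LEFT, RIGHT))
            nxt, reward, done = env_step(state, action)
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q

q_table = train()
# Greedy policy read off the table for the non-terminal states 0, 1, 2.
policy = [max((LEFT, RIGHT), key=lambda a: q_table[s][a]) for s in range(GOAL)]
```

After training, the greedy policy moves right in every state, and the learned values decay by the discount factor with distance from the goal.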

Quantization

Quantization is a technique that reduces model size and speeds up inference by converting high-precision weights to lower precision (e.g., 32-bit floats to 8-bit or 4-bit integers). It enables large models to run on consumer hardware, often with only a small loss in accuracy.

AI Infrastructure
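
One simple scheme, shown here as a sketch, is symmetric 8-bit quantization with a single shared scale. Real toolkits use more elaborate variants (per-channel scales, zero points, calibration); the weights below are made up.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # fall back to 1.0 if all weights are zero
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]

weights = [0.82, -1.27, 0.003, 0.5]
q_weights, scale = quantize_int8(weights)   # integers 82, -127, 0, 50 with scale near 0.01
restored = dequantize(q_weights, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The rounding error per weight is at most half the scale, which is why small weights such as 0.003 can collapse to zero.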

Query in Attention

In the attention mechanism, a query is a vector representing what information the current position is looking for. Queries interact with keys and values to compute attention weights that determine which parts of the input to focus on.

Deep Learning
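
The interaction between a query, keys, and values can be sketched as scaled dot-product attention for a single query. The vectors are invented, and real implementations operate on batched tensors rather than Python lists.

```python
import math

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]      # softmax over input positions
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0], [20.0], [30.0]]
attn_weights, output = attend([4.0, 0.0], keys, values)
# The query matches keys 0 and 2 equally and key 1 hardly at all.
```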

Question Answering

Robotics & Automation

Synthetic Data

Synthetic data is artificially generated data that mimics the statistical properties of real-world data. It is used to augment training datasets, protect privacy, and test models when real data is scarce or sensitive.

Data Science

System Prompt

Turing Test

The Turing Test is a measure of machine intelligence proposed by Alan Turing where a human judge evaluates whether they are conversing with a human or a machine. If the judge cannot reliably distinguish, the machine is said to exhibit intelligent behavior.

Fundamentals

Type I and Type II Error

A Type I error (false positive) occurs when a model predicts a positive outcome for a case that is actually negative, while a Type II error (false negative) occurs when it predicts a negative outcome for a case that is actually positive. Understanding these errors is crucial for evaluating model performance in context.

Data Science
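
Counting both error types from labels and predictions takes only a few lines. The labels below are made up, with 1 meaning positive and 0 meaning negative.

```python
def error_counts(y_true, y_pred):
    """Return (false positives, false negatives) for binary labels 0/1."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # Type I
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # Type II
    return fp, fn

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
false_pos, false_neg = error_counts(y_true, y_pred)   # 1 and 2
false_pos_rate = false_pos / y_true.count(0)          # 1 of 4 negatives
false_neg_rate = false_neg / y_true.count(1)          # 2 of 4 positives
```

Which error matters more depends on the application: a missed disease diagnosis (Type II) and a false fraud alert (Type I) carry very different costs.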

U (4 terms)

Underfitting

Underfitting occurs when a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and test data. Increasing model complexity or training longer can address underfitting.

Machine Learning

Unsupervised Learning

Unsupervised learning is a machine learning approach where models discover patterns and structure in data without labeled examples. Clustering, dimensionality reduction, and anomaly detection are common unsupervised tasks.

Machine Learning

Unsupervised Pre-training

Unsupervised pre-training is the process of training a model on unlabeled data to learn general representations before fine-tuning on labeled data. It underpins modern foundation models and transfer learning.

Deep Learning

Upsampling

Upsampling is a technique that increases the spatial resolution of data, commonly used in image generation and segmentation to produce higher-resolution outputs. Transposed convolutions and interpolation are common upsampling methods.

Deep Learning
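
The simplest interpolation method, nearest-neighbor upsampling, repeats each pixel along both axes. A minimal sketch on a nested list standing in for an image:

```python
def upsample_nearest(image, factor):
    """Repeat every pixel `factor` times horizontally and vertically."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

small = [[1, 2],
         [3, 4]]
large = upsample_nearest(small, 2)
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Transposed convolutions differ in that the values used to fill the larger grid are learned rather than copied.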

V (6 terms)

Validation Set

A validation set is a portion of data held out from training to evaluate model performance during development and tune hyperparameters. It helps detect overfitting and guides model selection before final testing.

Data Science

Vanishing Gradient

The vanishing gradient problem occurs when gradients become extremely small during backpropagation through many layers, making it difficult to train deep networks. Skip connections and normalization techniques were developed to address this issue.

Deep Learning

Variational Autoencoder

A Variational Autoencoder (VAE) is a generative model that learns a probabilistic latent representation of data. VAEs can generate new data by sampling from the learned latent space distribution.

Generative AI

Vector

A vector is an ordered array of numbers that represents a point or direction in multi-dimensional space. In AI, vectors (embeddings) encode the semantic meaning of words, images, and other data types.

Fundamentals

Vector Database

A vector database is a specialized storage system optimized for storing, indexing, and querying high-dimensional vector embeddings. It powers semantic search, recommendation systems, and RAG applications.

AI Infrastructure

Vision Transformer

A Vision Transformer (ViT) applies the Transformer architecture to image recognition by treating image patches as tokens. ViTs have matched or exceeded CNNs on many computer vision benchmarks.

Computer Vision

W (5 terms)

Watermarking

AI watermarking is the technique of embedding hidden, detectable signals in AI-generated content to identify its origin. It helps distinguish AI-generated text and images from human-created content.

AI Ethics & Safety

Weight

A weight is a numerical parameter in a neural network that determines the strength of the connection between neurons. Weights are learned during training through backpropagation and gradient descent.

Deep Learning

Weight Initialization

Weight initialization is the strategy for setting initial values of neural network weights before training begins. Proper initialization (like Xavier or He initialization) helps prevent vanishing or exploding gradients.

Deep Learning
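
As a sketch, the normal-distribution variants of Xavier (Glorot) and He initialization differ only in the standard deviation they pick from the layer's size. The layer dimensions and helper names below are assumptions for illustration.

```python
import math
import random

def xavier_std(fan_in, fan_out):
    """Xavier/Glorot normal: variance 2 / (fan_in + fan_out)."""
    return math.sqrt(2.0 / (fan_in + fan_out))

def he_std(fan_in):
    """He normal: variance 2 / fan_in, commonly paired with ReLU activations."""
    return math.sqrt(2.0 / fan_in)

def init_weights(fan_in, fan_out, std, seed=0):
    """Draw a fan_in x fan_out weight matrix from a zero-mean normal distribution."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]

layer = init_weights(256, 128, he_std(256))
```

Both rules scale the weights down as layers get wider, so that activations keep a similar magnitude from layer to layer.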

Word Embedding

A word embedding is a dense vector representation of a word that captures its semantic meaning and relationships to other words. Words with similar meanings are mapped to nearby points in embedding space.

Natural Language Processing
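
"Nearby points" is usually measured with cosine similarity. The sketch below uses tiny hand-made 3-dimensional vectors purely for illustration; real embeddings are learned and have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity: 1.0 for the same direction, near 0 for unrelated vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings, not taken from any trained model.
EMBEDDINGS = {
    "king":   [0.9, 0.8, 0.1],
    "queen":  [0.9, 0.7, 0.2],
    "banana": [0.1, 0.0, 0.9],
}

sim_royal = cosine(EMBEDDINGS["king"], EMBEDDINGS["queen"])
sim_fruit = cosine(EMBEDDINGS["king"], EMBEDDINGS["banana"])
# sim_royal is close to 1, sim_fruit is much lower
```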

Word2Vec

Word2Vec is a pioneering neural network model that learns word embeddings from large text corpora. Developed by Google in 2013, it demonstrated that vector arithmetic on word embeddings captures semantic relationships.

Natural Language Processing

X (2 terms)

XAI

XAI (Explainable Artificial Intelligence) refers to methods and techniques that make AI decision-making processes transparent and interpretable to humans. XAI builds trust and enables accountability in AI systems.

AI Ethics & Safety

XGBoost

XGBoost (Extreme Gradient Boosting) is a highly optimized gradient boosting library known for its speed and performance in structured data competitions. It remains one of the most popular algorithms for tabular data.

Machine Learning

Y (1 term)

YOLO

YOLO (You Only Look Once) is a real-time object detection algorithm that processes entire images in a single pass. Its speed makes it ideal for applications like autonomous driving and video surveillance.

Computer Vision

Z (2 terms)

Zero-Shot Learning

Zero-shot learning is the ability of a model to correctly handle tasks or recognize classes it has never been explicitly trained on. Large language models demonstrate strong zero-shot capabilities across diverse tasks.

Machine Learning

Zero-Shot Prompting

Zero-shot prompting is providing a language model with task instructions and no examples, relying on its pre-trained knowledge to perform the task. It tests the model's ability to generalize from training to novel instructions.

Generative AI

A/B Testing

Abstractive Summarization

Accuracy

Activation Function

Active Learning

Adam Optimizer

Adapter Layers

Adversarial Attack

Adversarial Training

Agent

Agentic AI

AGI

AI Alignment

AI Chip

AI Ethics

AI Safety

AI Visibility

AI Winter

Algorithm

Annotation

Anomaly Detection

API

Artificial General Intelligence

Artificial Intelligence

Artificial Narrow Intelligence

Artificial Superintelligence

Attention Mechanism

Autoencoder

AutoML

Autonomous Systems

Backpropagation

Bagging

Batch Normalization

Batch Size

Bayesian Network

Beam Search

Benchmark

BERT

Bias in AI

Bias-Variance Tradeoff

Bigram

Binary Classification

Boltzmann Machine

Boosting

Bounding Box

Byte Pair Encoding

Catastrophic Forgetting

Causal Inference

Chain of Thought

Chatbot

ChatGPT

Classification

Claude

Clustering

CNN

Computer Vision

Confusion Matrix

Constitutional AI

Continual Learning

Contrastive Learning

Convolutional Neural Network

Corpus

Cross-Entropy

Cross-Validation

CUDA

Curriculum Learning

Data Augmentation

Data Drift

Data Labeling

Data Lake

Data Pipeline

Data Warehouse

Decision Boundary

Decision Tree

Deep Learning

Deep Reinforcement Learning

Deepfake

Depthwise Separable Convolution

Diffusion Model
