What is Top-p Sampling?

Top-p Sampling

Top-p sampling (nucleus sampling) samples the next token from the smallest set of most-probable tokens whose cumulative probability reaches a threshold p. The size of this candidate pool changes at every step with the model's confidence.

Understanding Top-p Sampling

Top-p sampling, also known as nucleus sampling, ranks tokens from most to least probable, keeps the smallest set whose cumulative probability reaches a threshold p, renormalizes the probabilities within that set, and samples from it. Unlike top-k sampling, which always considers a fixed number of candidates, top-p adapts the candidate pool size at each step. When the model is confident, a few tokens already cover the threshold and only those are considered; when it is uncertain, the probability mass is spread out and more options remain available. A typical value of p = 0.9 means the model samples from the tokens that together account for at least 90% of the probability mass, which filters out the long tail of unlikely tokens. This adaptive behavior tends to produce more natural and coherent text than a fixed-k cutoff, which can be too permissive when the model is confident and too restrictive when it is not. Many large language model APIs expose top-p as a decoding parameter alongside temperature.
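
The filtering step can be sketched in a few lines of Python. This is a minimal illustration that assumes NumPy and a probability vector that already sums to 1; the function names top_p_filter and sample_top_p are hypothetical and are not taken from any particular library or API.

```python
import numpy as np

def top_p_filter(probs, p=0.9):
    """Zero out tokens outside the nucleus and renormalize the rest."""
    probs = np.asarray(probs, dtype=float)
    # Token indices ordered from most to least probable.
    order = np.argsort(-probs, kind="stable")
    cumulative = np.cumsum(probs[order])
    # Length of the shortest prefix whose cumulative mass reaches p.
    # At least one token is always kept.
    cutoff = int(np.searchsorted(cumulative, p, side="left")) + 1
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

def sample_top_p(probs, p=0.9, rng=None):
    """Draw one token index from the renormalized nucleus."""
    if rng is None:
        rng = np.random.default_rng()
    nucleus = top_p_filter(probs, p)
    return int(rng.choice(len(nucleus), p=nucleus))

# A confident distribution: one token already covers the threshold.
confident = [0.97, 0.01, 0.01, 0.01]
# An uncertain distribution: all four tokens are needed to reach 0.9.
uncertain = [0.25, 0.25, 0.25, 0.25]

print(np.count_nonzero(top_p_filter(confident, 0.9)))  # 1
print(np.count_nonzero(top_p_filter(uncertain, 0.9)))  # 4
```

With the same p, the confident distribution keeps a single candidate while the flat one keeps all four, which is the adaptive pool size described above. A top-k filter with k = 2 would keep two tokens in both cases.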

Related Generative AI Terms

Chain of Thought

ChatGPT

Claude

Diffusion Model

Discriminator

Few-Shot Prompting

Foundation Model

GAN