What is Query in Attention?

Query in Attention

In the attention mechanism, a query is a vector representing what information the current position is looking for. Queries are compared with keys to compute attention weights, and those weights determine how much each value, and therefore each part of the input, contributes to the output.

Understanding Query in Attention

The query is one of three learned vectors, alongside the key and the value, that together let a transformer model focus dynamically on the relevant parts of its input when producing each output element. During self-attention, each token in the input sequence is projected into a query vector, which is compared against every key vector by dot product. The resulting scores are typically scaled and normalized with a softmax to give attention weights, and those weights determine how much each value vector contributes to the output representation. This query-key-value framework, inspired by information retrieval, lets the model learn what information to look for (query), what information is available (key), and what content to retrieve (value). Multi-head attention runs this process several times in parallel with different learned projections, so the model can attend to different types of relationships simultaneously. This mechanism is central to the transformer architectures that power modern large language models and has become fundamental to natural language processing.
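The steps described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function names (`softmax`, `matvec`, `attend`, `self_attention`, `multi_head`), the toy token embeddings, and the identity projection matrices are all assumptions made for the example. It assumes the common scaled dot-product form, softmax(Q K^T / sqrt(d_k)) V, and the multi-head helper leaves out the final output projection that real transformer layers usually apply.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d_k = len(query)
    # Compare the query against every key, then scale by sqrt(d_k).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
              for key in keys]
    weights = softmax(scores)
    # The output is the attention-weighted sum of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

def self_attention(tokens, w_q, w_k, w_v):
    """Project each token into a query, key, and value, then attend."""
    queries = [matvec(w_q, t) for t in tokens]
    keys = [matvec(w_k, t) for t in tokens]
    values = [matvec(w_v, t) for t in tokens]
    return [attend(q, keys, values) for q in queries]

def multi_head(tokens, heads):
    """Run self_attention once per (w_q, w_k, w_v) triple and concatenate
    the per-token outputs. The usual output projection is omitted here."""
    per_head = [self_attention(tokens, *head) for head in heads]
    return [sum((head[i][1] for head in per_head), [])
            for i in range(len(tokens))]

# Hypothetical toy data: three 2-dimensional token embeddings and
# identity projections (in a trained model these matrices are learned).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

results = self_attention(tokens, identity, identity, identity)
weights_for_first_token, output_for_first_token = results[0]
```

With these toy inputs, the first token's query matches the first and third keys equally well, so those two positions receive the same weight and the second position receives less. In a trained model the projection matrices are learned, which is what lets each head specialize in a different kind of relationship.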

Related Deep Learning Terms

Activation Function

Adam Optimizer

Adapter Layers

Attention Mechanism

Autoencoder

Backpropagation

Batch Normalization

Batch Size