Bounding Box
A bounding box is a rectangular border drawn around an object in an image to indicate its location and extent. Bounding boxes are the primary output format for object detection models.
Understanding Bounding Boxes
Bounding boxes are rectangular annotations that localize objects within images by specifying their position and extent, typically defined either by the coordinates of two opposite corners or by a center point with a width and height. They serve as the fundamental output format for object detection models, from pioneering architectures like R-CNN to modern real-time detectors like YOLO and SSD. During annotation, human labelers draw bounding boxes to create the ground truth labels that models learn to predict, and evaluation metrics like Intersection over Union (IoU) measure how well predicted boxes overlap with that ground truth. While bounding boxes are computationally efficient and intuitive, they provide only coarse localization and cannot capture the precise shape of irregular objects, which is why the field has also developed instance segmentation approaches. Bounding boxes remain essential in applications like autonomous driving, surveillance, and medical imaging powered by convolutional neural networks.
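The two coordinate formats and the IoU metric mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration (the function and variable names are our own, not from any particular library):

```python
def xywh_to_xyxy(cx, cy, w, h):
    """Convert center/width/height format to corner format (x1, y1, x2, y2)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def iou(box_a, box_b):
    """Intersection over Union of two corner-format boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping region
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical example: a predicted box vs. a ground-truth box
pred = (10, 10, 50, 50)
gt = xywh_to_xyxy(35, 35, 40, 40)   # -> (15.0, 15.0, 55.0, 55.0)
print(round(iou(pred, gt), 2))      # -> 0.62
```

Detection benchmarks typically count a prediction as correct when its IoU with a ground-truth box exceeds a threshold such as 0.5.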
Category
Computer Vision
Related Computer Vision Terms
Computer Vision
Computer vision is a field of AI that enables machines to interpret and understand visual information from images and videos. Applications include facial recognition, autonomous driving, medical imaging, and augmented reality.
Face Recognition
Face recognition is a computer vision technology that identifies or verifies individuals by analyzing facial features in images or video. It is used in security systems, phone unlocking, and photo organization.
Image Captioning
Image captioning is the AI task of generating natural language descriptions of images. It requires both visual understanding (computer vision) and text generation (NLP) capabilities.
Image Classification
Image classification is the computer vision task of assigning a label to an entire image based on its visual content. Deep learning models like ResNet and Vision Transformers achieve near-human accuracy on this task.
Image Segmentation
Image segmentation is the process of partitioning an image into meaningful regions or classifying each pixel into a category. It is used in medical imaging, autonomous driving, and satellite analysis.
Instance Segmentation
Instance segmentation is a computer vision task that identifies each object in an image and delineates its exact pixel boundary. Unlike semantic segmentation, it distinguishes between individual instances of the same class.
Masked Autoencoder
A masked autoencoder is a self-supervised learning method that masks random patches of an image and trains the model to reconstruct them. It has proven highly effective for pre-training vision models.
Neural Radiance Field
A Neural Radiance Field (NeRF) is a deep learning method that represents 3D scenes as continuous functions, enabling photorealistic novel view synthesis from 2D images. NeRFs have transformed 3D reconstruction and rendering.