MS COCO: 20 Essential Q&A
Detection, segmentation, captions, and person keypoints—benchmark details interviewers expect.
~11 min read · 20 questions · Intermediate
instances · stuff · captions · OKS
1. What is MS COCO?
⚡ easy
Answer: Common Objects in Context—benchmark for detection, segmentation, captions, and keypoints with rich scene images.
2. Which tasks?
📊 medium
Answer: Object detection (bounding boxes), instance segmentation, panoptic segmentation (things + stuff), image captioning, and person keypoints; each task has its own annotation files and evaluation metrics.
3. JSON annotations?
📊 medium
Answer: COCO format lists images, categories, annotations with bbox [x,y,w,h], segmentation polygons/RLE, area, iscrowd flag.
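A minimal sketch of that layout as a Python dict (the field values below are made up purely for illustration):
coco_format = {
    "images": [
        {"id": 1, "file_name": "000000000001.jpg", "width": 640, "height": 480}
    ],
    "categories": [
        {"id": 18, "name": "dog", "supercategory": "animal"}
    ],
    "annotations": [
        {
            "id": 101,
            "image_id": 1,
            "category_id": 18,
            "bbox": [120.0, 80.0, 200.0, 150.0],  # [x, y, width, height]
            "segmentation": [[130.0, 85.0, 310.0, 85.0, 310.0, 225.0, 130.0, 225.0]],  # polygon(s) or RLE
            "area": 27000.0,
            "iscrowd": 0,
        }
    ],
}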
4. Bbox format?
⚡ easy
Answer: Top-left x, y plus width and height in pixels; convert carefully to and from the corner (x1, y1, x2, y2) convention used in many codebases.
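A small conversion helper between the two conventions (function names are hypothetical):
def xywh_to_xyxy(box):
    # COCO [x, y, w, h] -> corner [x1, y1, x2, y2]
    x, y, w, h = box
    return [x, y, x + w, y + h]

def xyxy_to_xywh(box):
    # corner [x1, y1, x2, y2] -> COCO [x, y, w, h]
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]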
5. Instance masks?
📊 medium
Answer: Stored as polygon lists for single instances and compressed RLE for crowd regions; training pipelines such as Mask R-CNN decode them to per-instance binary masks.
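One way to get a binary mask regardless of how the segmentation is stored, using pycocotools (the annotation path is an assumption):
from pycocotools.coco import COCO

coco = COCO("annotations/instances_val2017.json")  # assumed local path
ann = coco.loadAnns(coco.getAnnIds()[:1])[0]       # first annotation, just as an example

# annToMask handles both polygon and RLE segmentations and
# returns a binary numpy array of shape (height, width).
mask = coco.annToMask(ann)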
6. Panoptic on COCO?
🔥 hard
Answer: Unifies semantic “stuff” and instance “things” with PQ metric—requires non-overlapping label assignment per pixel.
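A small sketch of the PQ computation, where predicted and ground-truth segments matched at IoU > 0.5 count as true positives:
def panoptic_quality(matched_ious, num_fp, num_fn):
    # matched_ious: IoU values of the matched (TP) segment pairs
    # num_fp, num_fn: unmatched predicted and ground-truth segments
    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    return sum(matched_ious) / denom if denom > 0 else 0.0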
7. Captions?
📊 medium
Answer: Multiple human captions per image—evaluation with BLEU/CIDEr/SPICE; encourages descriptive models.
8. Keypoints?
📊 medium
Answer: 17 body joints for person instances—AP computed with OKS instead of IoU for matching.
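A sketch of OKS following the pycocotools definition: only keypoints labeled visible in the ground truth contribute, and the sigmas are the per-joint falloff constants shipped with the official eval code.
import numpy as np

def oks(pred, gt, visibility, area, sigmas):
    # pred, gt: (17, 2) keypoint coordinates; visibility: (17,) GT flags;
    # area: GT object area (the scale term); sigmas: (17,) per-joint constants.
    d2 = np.sum((pred - gt) ** 2, axis=1)             # squared pixel distances per joint
    var = (2 * sigmas) ** 2
    e = d2 / (2 * area * var + np.finfo(float).eps)   # normalized error per joint
    visible = visibility > 0
    return float(np.mean(np.exp(-e[visible]))) if visible.any() else 0.0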
9. Detection mAP on COCO?
🔥 hard
Answer: Primary AP averaged over IoU thresholds 0.5:0.05:0.95 (AP@[.5:.95]) plus AP50, AP75—stricter than VOC AP50 only.
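What the averaging looks like, with made-up per-threshold AP values purely to illustrate:
import numpy as np

iou_thresholds = np.linspace(0.5, 0.95, 10)   # 0.50, 0.55, ..., 0.95

# Illustrative per-threshold AP values for one model (not real results).
ap_values = np.array([0.62, 0.60, 0.58, 0.55, 0.51, 0.46, 0.39, 0.30, 0.19, 0.07])

coco_map = ap_values.mean()               # the headline AP@[.5:.95]
ap50, ap75 = ap_values[0], ap_values[5]   # AP at IoU 0.50 and 0.75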
10. Mask AP?
📊 medium
Answer: AP computed on mask IoU instead of box IoU—segmentation quality can differ from bbox AP ranking.
11. 80 classes?
⚡ easy
Answer: Thing categories for detection—plus stuff classes in panoptic/stuff annotations; don’t confuse with 91 legacy lists in some code.
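Listing the categories with pycocotools makes the 80-vs-91 distinction concrete (the annotation path is an assumption):
from pycocotools.coco import COCO

coco = COCO("annotations/instances_val2017.json")  # assumed local path
cats = coco.loadCats(coco.getCatIds())
print(len(cats))                   # 80 thing categories
print(max(c["id"] for c in cats))  # ids go up to 90 with gaps, a legacy of the original 91-class list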
12. train/val/test?
📊 medium
Answer: train2017 and val2017 are public; test-dev annotations are withheld and scored on the evaluation server. Papers typically report val2017 metrics for fair comparison.
13. pycocotools?
📊 medium
Answer: Official eval code for mAP, mask IoU, RLE decode—implementations should match to reproduce leaderboard numbers.
from pycocotools.coco import COCO; coco = COCO("annotations.json")
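A typical evaluation loop (file names are assumptions); switching iouType to "segm" or "keypoints" reuses the same flow for mask AP and OKS-based keypoint AP:
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations/instances_val2017.json")  # ground truth (assumed path)
coco_dt = coco_gt.loadRes("detections.json")          # predictions in COCO results format

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()   # prints AP@[.5:.95], AP50, AP75, AP_S/M/L, and AR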
14. Small objects?
📊 medium
Answer: COCO reports AP_S/M/L by area—models struggle on small; anchors/FPN designs target scale variance.
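The default area buckets used by the official evaluator for those breakdowns, in pixels squared:
AREA_SMALL  = (0, 32 ** 2)         # AP_S: area < 32^2
AREA_MEDIUM = (32 ** 2, 96 ** 2)   # AP_M: 32^2 <= area < 96^2
AREA_LARGE  = (96 ** 2, 1e5 ** 2)  # AP_L: area >= 96^2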
15. iscrowd flag?
🔥 hard
Answer: Marks group annotations (stored as RLE) where individual instances cannot be separated; the official evaluation treats them as ignore regions, so detections overlapping a crowd are neither matched nor penalized as false positives.
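A common pattern when building training targets is to split crowd regions from per-instance labels (the annotation path is an assumption):
from pycocotools.coco import COCO

coco = COCO("annotations/instances_val2017.json")  # assumed local path
anns = coco.loadAnns(coco.getAnnIds(imgIds=coco.getImgIds()[0], iscrowd=None))

instances = [a for a in anns if a["iscrowd"] == 0]  # normal per-instance labels
crowds    = [a for a in anns if a["iscrowd"] == 1]  # RLE group regions, usually treated as ignore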
16. Eval servers?
⚡ easy
Answer: Upload predictions for test sets—prevents test overfitting; val is for iteration.
17. Relation to LVIS?
📊 medium
Answer: LVIS is a long-tail, large-vocabulary extension built on COCO images; the tooling is similar, but the category frequency distribution is very different, and training is often done jointly with COCO data.
18. Image source?
⚡ easy
Answer: Flickr-licensed photos of everyday scenes—more contextual clutter than ImageNet object-centric photos.
19. Why default benchmark?
📊 medium
Answer: Challenging scale, multi-task labels, standardized API—dominant for comparing detectors and instance seg models.
20. Common pitfalls?
📊 medium
Answer: Wrong bbox convention, ignoring iscrowd, different NMS thresholds, or not using official eval—numbers won’t match papers.
COCO Cheat Sheet
Det: AP@[.5:.95]
Seg: RLE masks
Tools: pycocotools
💡 Pro tip: COCO mAP is mean over IoU 0.5–0.95—not just AP50.