Computer Vision Interview
20 essential Q&A
Updated 2026
SIFT: 20 Essential Q&A
Difference-of-Gaussians, keypoint refinement, and why SIFT dominated matching for years.
~12 min read
20 questions
Advanced
DoG, octaves, 128-D, ratio test
1
What is SIFT?
⚡ easy
Answer: Scale-Invariant Feature Transform—detects blob-like keypoints in scale-space and builds a 128-D gradient-orientation histogram descriptor; robust to scale, rotation, moderate viewpoint/lighting.
2
What is Difference of Gaussians (DoG)?
📊 medium
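Answer: DoG(x, y, σ) = G(x, y, kσ) − G(x, y, σ), applied to the image as the difference of two adjacent blurred levels. It approximates the scale-normalized Laplacian of Gaussian (σ²∇²G) and is a cheap way to find blob-like structures across scales, because the blurred images are needed for the pyramid anyway.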
3
What is an octave?
📊 medium
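Answer: A group of images at one resolution whose blur grows from σ to 2σ in several steps (s intervals, each a factor of 2^(1/s)). The next octave starts from the 2σ image downsampled by 2, so a large scale range is covered efficiently.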
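As a rough sketch of that scale schedule, assuming the commonly cited defaults σ0 = 1.6 and s = 3 intervals per octave (values from Lowe's paper, not stated above), the blur levels could be listed as follows. The function name octave_sigmas is illustrative only.

```python
def octave_sigmas(num_octaves=3, intervals=3, sigma0=1.6):
    """Return, per octave, the absolute blur of each Gaussian image.

    Assumes s + 3 Gaussian images per octave so that DoG extrema can be
    searched across a full doubling of sigma.
    """
    k = 2 ** (1.0 / intervals)  # multiplicative step between levels
    schedule = []
    for octave in range(num_octaves):
        base = sigma0 * (2 ** octave)  # each octave starts at double the blur
        schedule.append([base * k ** i for i in range(intervals + 3)])
    return schedule


for o, sigmas in enumerate(octave_sigmas()):
    print(o, [round(s, 2) for s in sigmas])
```

Level s of one octave has the same blur as level 0 of the next, which is why that image can be downsampled to seed the next octave.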
4
How are keypoints detected?
🔥 hard
Answer: 3×3×3 neighborhood search for scale-space extrema (max/min) in DoG volume—candidate keypoints.
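A minimal sketch of the 26-neighbor comparison, assuming the DoG volume is a nested list indexed [scale][row][col]. The helper name is_extremum is hypothetical, and real implementations vectorize this and apply a contrast threshold first.

```python
def is_extremum(dog, s, y, x):
    """True if dog[s][y][x] is strictly above or strictly below all 26 neighbors.

    (s, y, x) must not lie on the border of the volume.
    """
    center = dog[s][y][x]
    neighbors = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return center > max(neighbors) or center < min(neighbors)


# Tiny 3x3x3 volume of zeros with a single peak in the middle.
volume = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
volume[1][1][1] = 1.0
print(is_extremum(volume, 1, 1, 1))  # True
```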
5
Refinement and edge rejection?
🔥 hard
Answer: Taylor expansion fit for subpixel location and scale; reject low contrast; use Hessian of DoG to reject edge-like unstable peaks (ratio of principal curvatures).
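The edge check can be sketched with the 2×2 Hessian of the DoG image: a keypoint is kept only if Tr(H)² / Det(H) < (r + 1)² / r. The threshold r = 10 is the value from Lowe's paper (an assumption here, since the answer above does not give one), and passes_edge_test is an illustrative name.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a keypoint only if its principal-curvature ratio is below r.

    dxx, dyy, dxy are second derivatives of the DoG image at the keypoint
    (finite differences in practice).
    """
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:  # curvatures have opposite signs: not a stable blob
        return False
    return trace * trace / det < (r + 1) ** 2 / r


print(passes_edge_test(-1.0, -1.0, 0.0))   # blob-like: True
print(passes_edge_test(-10.0, -0.1, 0.0))  # edge-like: False
```

Using the trace and determinant avoids computing the eigenvalues explicitly; only their ratio matters.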
6
Orientation histogram?
📊 medium
Answer: Weighted gradient orientations in neighborhood; peak(s) define canonical rotation—descriptor becomes rotation invariant.
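A simplified sketch of the orientation assignment, assuming 36 bins of 10° and the rule that any peak within 80% of the highest one also yields an orientation (both values are from Lowe's paper, not from the answer above). Histogram smoothing and parabolic peak interpolation are omitted, and the function name is hypothetical.

```python
def dominant_orientations(angles_deg, weights, num_bins=36, peak_ratio=0.8):
    """Return bin-center angles (degrees) of the histogram peaks.

    weights are assumed to already combine gradient magnitude and the
    Gaussian window around the keypoint.
    """
    bin_width = 360.0 / num_bins
    hist = [0.0] * num_bins
    for angle, weight in zip(angles_deg, weights):
        hist[int((angle % 360.0) // bin_width) % num_bins] += weight

    top = max(hist)
    if top == 0:
        return []
    peaks = []
    for b, value in enumerate(hist):
        # The histogram is circular, so bin 0 neighbors the last bin.
        left, right = hist[b - 1], hist[(b + 1) % num_bins]
        if value >= peak_ratio * top and value > left and value > right:
            peaks.append((b + 0.5) * bin_width)
    return peaks


print(dominant_orientations([42, 44, 47, 48], [1, 1, 1, 1]))  # [45.0]
```

When two peaks qualify, the keypoint is duplicated with one orientation each, so the returned list can hold more than one angle.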
7
How is the descriptor built?
📊 medium
Answer: 16×16 window into 4×4 cells; each cell has 8-bin orientation histogram of gradients; 4×4×8 = 128 values, normalized.
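The pooling layout can be sketched as below. This is deliberately simplified: hard binning only, with no Gaussian weighting and no trilinear interpolation between cells and bins, and the angles are assumed to be already expressed relative to the keypoint's dominant orientation. The name raw_descriptor is illustrative.

```python
def raw_descriptor(magnitudes, angles_deg):
    """Pool a 16x16 patch of gradients into a 4x4x8 = 128-value vector."""
    hist = [0.0] * 128
    for y in range(16):
        for x in range(16):
            cell = (y // 4) * 4 + (x // 4)  # which of the 16 cells
            obin = int((angles_deg[y][x] % 360.0) // 45.0) % 8  # which of 8 bins
            hist[cell * 8 + obin] += magnitudes[y][x]
    return hist


mags = [[1.0] * 16 for _ in range(16)]
angs = [[0.0] * 16 for _ in range(16)]
desc = raw_descriptor(mags, angs)
print(len(desc), desc[:8])
```

With unit gradients all pointing at 0°, every cell puts its 16 pixels into orientation bin 0, so the vector has 16 non-zero entries of 16.0 each.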
8
Why 4×4 grid?
⚡ easy
Answer: Balances spatial layout (localization) vs distinctiveness; finer grid more sensitive to deformation.
9
Why normalize twice?
📊 medium
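Answer: L2-normalize to cancel contrast (gain) changes; clip values above a threshold (0.2 in Lowe's paper) so a few large gradients caused by non-linear lighting effects do not dominate; then renormalize. A brightness offset is already cancelled because the descriptor is built from gradients.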
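A minimal sketch of the normalize, clip, renormalize step, assuming a clip value of 0.2 (from Lowe's paper) and a non-negative input vector. The function name normalize_sift is hypothetical.

```python
import math


def normalize_sift(vec, clip=0.2):
    """L2-normalize, clip each entry at `clip`, then L2-normalize again."""
    def l2_normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v] if norm > 0 else list(v)

    unit = l2_normalize(vec)
    clipped = [min(x, clip) for x in unit]
    return l2_normalize(clipped)


raw = [10.0] + [1.0] * 127  # one gradient bin dominates
out = normalize_sift(raw)
print(round(out[0], 3), round(out[1], 3))
```

Scaling the input by a constant leaves the output unchanged, which is the contrast invariance; the clip limits how much the single large bin contributes.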
10
What is RootSIFT?
📊 medium
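Answer: L1-normalize the SIFT vector, then take the element-wise square root. The result already has unit L2 norm, and Euclidean distance between RootSIFT vectors corresponds to the Hellinger kernel on the original histograms; this often improves retrieval.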
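The transform is short enough to sketch directly. The eps guard against an all-zero vector and the name root_sift are assumptions of this sketch; the input is assumed non-negative, as SIFT histograms are.

```python
import math


def root_sift(vec, eps=1e-12):
    """L1-normalize a non-negative SIFT vector, then take element-wise sqrt."""
    l1 = sum(vec) + eps
    return [math.sqrt(x / l1) for x in vec]


a = root_sift([4.0, 0.0, 1.0, 3.0])
b = root_sift([1.0, 1.0, 1.0, 1.0])
print(sum(x * x for x in a))             # squared L2 norm, approximately 1
print(sum(x * y for x, y in zip(a, b)))  # Hellinger kernel of the L1-normalized inputs
```

Because the squared entries sum to the L1-normalized total, no extra L2 normalization is needed.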
11
SIFT invariances?
📊 medium
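Answer: Invariant to scale (scale-space detection) and in-plane rotation (dominant orientation); robust to affine illumination changes and to moderate affine distortion. Not invariant to strong viewpoint change or 3D perspective.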
12
SIFT vs ORB speed?
⚡ easy
Answer: SIFT heavier (float descriptor, pyramid DoG); ORB binary + FAST—ORB much faster on embedded/CPU.
13
SIFT patents?
⚡ easy
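Answer: The US patent expired in March 2020. Before that, OpenCV shipped SIFT in the contrib xfeatures2d module behind the nonfree build flag (OPENCV_ENABLE_NONFREE); it is now in the main module and freely usable.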
14
Typical matching?
📊 medium
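Answer: Nearest-neighbor search with L2 (or cosine) distance on the float vectors; Lowe's ratio test to drop ambiguous matches; then RANSAC to fit a geometric model and remove outliers.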
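A brute-force sketch of the ratio test, assuming descriptors are plain lists of floats and a ratio of 0.8 (Lowe's suggested value; 0.7 to 0.8 is common in practice). The function name is illustrative, and real pipelines would use a k-d tree or a library matcher instead.

```python
import math


def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Return (index_a, index_b) pairs that pass the nearest-neighbor ratio test.

    For illustration only; desc_b needs at least two entries.
    """
    matches = []
    for i, a in enumerate(desc_a):
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        (best, j), (second, _) = dists[0], dists[1]
        if best < ratio * second:  # best match must clearly beat the runner-up
            matches.append((i, j))
    return matches


query = [[0.0, 0.0], [5.0, 5.0]]
train = [[0.1, 0.0], [10.0, 10.0], [5.0, 5.1]]
print(ratio_test_matches(query, train))  # [(0, 0), (1, 2)]
```

A query whose two nearest neighbors are almost equally close is dropped, which is how repetitive texture gets filtered before RANSAC.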
15
Contrast threshold?
⚡ easy
Answer: Filters weak DoG extrema—reduces unstable keypoints on flat noise.
16
Why DoG approximates LoG?
📊 medium
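Answer: It is a finite-difference approximation, not an exact identity. The Gaussian satisfies the heat equation ∂G/∂σ = σ∇²G, so for k close to 1 (Lowe uses k = 2^(1/s)), G(kσ) − G(σ) ≈ (k − 1)σ²∇²G. The factor (k − 1) is the same at every scale, so it does not change where the extrema occur.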
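The approximation can be checked numerically with the closed-form 2-D Gaussian. This is a sketch evaluated at the blob center only (r² = 0) with σ = 1.6 chosen arbitrarily; the function names are hypothetical.

```python
import math


def gaussian(r2, sigma):
    """2-D isotropic Gaussian evaluated at squared radius r2."""
    return math.exp(-r2 / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)


def scale_normalized_log(r2, sigma):
    """sigma^2 times the Laplacian of the 2-D Gaussian at squared radius r2."""
    return (r2 - 2 * sigma ** 2) / sigma ** 2 * gaussian(r2, sigma)


def dog(r2, sigma, k):
    """Difference of Gaussians with blur levels k*sigma and sigma."""
    return gaussian(r2, k * sigma) - gaussian(r2, sigma)


sigma = 1.6
for k in (2 ** (1 / 3), 1.01):
    approx = (k - 1) * scale_normalized_log(0.0, sigma)
    print(round(k, 3), dog(0.0, sigma, k) / approx)
```

The printed ratio moves toward 1 as k approaches 1; with a practical step such as 2^(1/3) the magnitudes differ noticeably, which is acceptable because extremum detection depends on where the response peaks rather than on its exact value.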
17
Color SIFT?
🔥 hard
Answer: Compute SIFT on color channels or opponent color spaces for extra discriminability—more dimensions or fused descriptors.
18
PCA-SIFT?
🔥 hard
Answer: Project gradient patch to lower-dim PCA basis—smaller descriptor; less common now than vanilla SIFT or learned features.
19
OpenCV?
⚡ easy
Answer: cv2.SIFT_create() (in the main module since OpenCV 4.4, after the patent expired); its detectAndCompute(image, None) method returns keypoints and 128-D float descriptors.
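A hedged end-to-end sketch using the OpenCV Python bindings, assuming OpenCV 4.4 or newer and two grayscale uint8 images. The function name sift_match and the 0.75 ratio are choices of this sketch, not part of the OpenCV API.

```python
def sift_match(img_a, img_b, ratio=0.75):
    """Detect SIFT features in two grayscale images and ratio-test match them."""
    import cv2  # imported here so the sketch can be loaded without OpenCV installed

    sift = cv2.SIFT_create()
    kp_a, desc_a = sift.detectAndCompute(img_a, None)
    kp_b, desc_b = sift.detectAndCompute(img_b, None)
    if desc_a is None or desc_b is None:  # no keypoints found in one image
        return kp_a, kp_b, []

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(desc_a, desc_b, k=2):
        # knnMatch may return fewer than two neighbors for tiny descriptor sets.
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return kp_a, kp_b, good
```

Images would typically be loaded with cv2.imread(path, cv2.IMREAD_GRAYSCALE). The surviving matches can then be passed to cv2.findHomography with the cv2.RANSAC flag for geometric verification.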
20
Limitations?
📊 medium
Answer: Computation cost; ambiguous matches on repetitive texture; degraded detection under strong motion blur or specular highlights. Learned (deep) features often outperform SIFT when enough training data is available.
SIFT Cheat Sheet
Detect
- DoG extrema
- Subpixel + reject
Describe
- 4×4 × 8 orient
- Normalize ×2
Match
- L2 + ratio
- RANSAC
💡 Pro tip: DoG finds scale; orientation hist fixes rotation; 128-D is spatial pooling of gradients.