Computer Vision Chapter 12

SIFT

Scale-Invariant Feature Transform (SIFT) finds keypoints across a pyramid of blurred images (difference-of-Gaussian extrema), assigns each a scale and orientation, and encodes each patch as a 128-dimensional floating-point descriptor. It is robust under moderate scale, rotation, and illumination change, which makes it a good fit for wide-baseline matching and structure-from-motion. Current opencv-python releases ship SIFT in the main module; if cv2.SIFT_create raises AttributeError, upgrade.

Pipeline in brief

  1. Build a scale space with Gaussian blur at multiple scales per octave.
  2. Take Difference of Gaussians (DoG); find 3D extrema (x, y, scale).
  3. Refine location, discard low-contrast and edge-like points.
  4. Assign dominant orientation from gradient histograms.
  5. Sample a canonical 16×16 neighborhood into a 4×4 grid of 8-bin orientation histograms → 128 floats per keypoint.

Descriptor distance

Use Euclidean (L2) or L1; BFMatcher with NORM_L2 is the usual baseline.
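Concretely, the distance BFMatcher computes is just the vector norm between two 128-float descriptors (toy random vectors stand in for real SIFT output here):

```python
import numpy as np

# Two stand-in "descriptors"; real SIFT output is one 128-float row per keypoint.
d1 = np.random.default_rng(1).random(128).astype(np.float32)
d2 = np.random.default_rng(2).random(128).astype(np.float32)

l2 = float(np.linalg.norm(d1 - d2))   # what NORM_L2 computes
l1 = float(np.abs(d1 - d2).sum())     # what NORM_L1 computes
print(l2, l1)
```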

When to prefer SIFT

Texture-rich scenes, moderate viewpoint change, when ORB struggles with repeatability.

SIFT_create and detectAndCompute

import cv2

gray = cv2.imread("building.jpg", cv2.IMREAD_GRAYSCALE)
if gray is None:  # imread fails silently, returning None
    raise FileNotFoundError("building.jpg did not load")

sift = cv2.SIFT_create(nfeatures=500, nOctaveLayers=3, contrastThreshold=0.04,
                       edgeThreshold=10, sigma=1.6)
kp, des = sift.detectAndCompute(gray, None)  # des is None when nothing is found

print(len(kp), None if des is None else des.shape)

Raising contrastThreshold filters out more low-contrast (weak) keypoints; raising edgeThreshold retains more points along elongated, edge-like structures.

Detect only, then compute

kp = sift.detect(gray, None)
kp, des = sift.compute(gray, kp)

Brute-force L2 matching + ratio test

import cv2

im1 = cv2.imread("a.jpg", cv2.IMREAD_GRAYSCALE)
im2 = cv2.imread("b.jpg", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
k1, d1 = sift.detectAndCompute(im1, None)
k2, d2 = sift.detectAndCompute(im2, None)

bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=False)
pairs = bf.knnMatch(d1, d2, k=2)

good = []
for pair in pairs:
    if len(pair) < 2:
        continue
    m, n = pair
    if m.distance < 0.7 * n.distance:
        good.append(m)

vis = cv2.drawMatches(im1, k1, im2, k2, good[:80], None,
                      flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)

FLANN for larger descriptor sets

For thousands of keypoints, FLANN can be faster than exhaustive BF matching. Use KD-tree or k-means index parameters tuned to float descriptors.

import cv2
import numpy as np

FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)
flann = cv2.FlannBasedMatcher(index_params, search_params)

# SIFT descriptors are already float32; the cast is a cheap safeguard, since
# the KD-tree index only accepts CV_32F data.
d1f = d1.astype(np.float32)
d2f = d2.astype(np.float32)
pairs = flann.knnMatch(d1f, d2f, k=2)

Environment notes

If cv2.SIFT_create is missing, install a recent opencv-python: SIFT moved back into the main module in OpenCV 4.4.0, after the patent expired in many jurisdictions. Some older wheels required opencv-contrib-python. Check your version with print(cv2.__version__).

Takeaways

  • SIFT descriptors are float vectors—match with NORM_L2 (or L1).
  • Use kNN + ratio or geometry (findHomography + RANSAC) to drop outliers.
  • Heavier than ORB; use FLANN when matching large batches.

Quick FAQ

ORB or SIFT? ORB is faster and uses compact binary descriptors; SIFT is often stronger on difficult pairs but costs more CPU and memory. Profile on target hardware.

Are SIFT descriptors already unit-normalized? No: OpenCV normalizes them during construction but then rescales them, so the stored float values lie in [0, 255] and the vectors are not unit length. L2 matching still works because every descriptor shares the same scale; if you need unit vectors (e.g. for cosine similarity), normalize them yourself.
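If your pipeline does need unit-length vectors, a normalization pass is one line of NumPy. Toy random descriptors stand in for real SIFT output here:

```python
import numpy as np

# Stand-in descriptor matrix: 10 keypoints x 128 dims, values in [0, 255].
des = np.random.default_rng(0).uniform(0, 255, (10, 128)).astype(np.float32)

norms = np.linalg.norm(des, axis=1, keepdims=True)
unit = des / np.maximum(norms, 1e-12)  # guard against all-zero rows
print(np.allclose(np.linalg.norm(unit, axis=1), 1.0))
```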