Pipeline in brief
- Build a scale space with Gaussian blur at multiple scales per octave.
- Take Difference of Gaussians (DoG); find 3D extrema (x, y, scale).
- Refine location, discard low-contrast and edge-like points.
- Assign dominant orientation from gradient histograms.
- Sample a canonical 16×16 neighborhood into a 4×4 grid of 8-bin orientation histograms → 4×4×8 = 128 floats per keypoint.
Descriptor distance
Use Euclidean (L2) or L1; BFMatcher with NORM_L2 is the usual baseline.
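As a quick illustration of the distances themselves (synthetic 128-D vectors standing in for real SIFT descriptors):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(128, dtype=np.float32)  # stand-in SIFT descriptor
b = rng.random(128, dtype=np.float32)

l2 = float(np.linalg.norm(a - b))  # Euclidean, what NORM_L2 computes
l1 = float(np.abs(a - b).sum())    # Manhattan, what NORM_L1 computes
print(round(l2, 3), round(l1, 3))
```

L1 is cheaper per comparison; L2 is the conventional choice for SIFT and what BFMatcher uses by default for float descriptors.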
When to prefer SIFT
Texture-rich scenes, moderate viewpoint change, when ORB struggles with repeatability.
SIFT_create and detectAndCompute
import cv2
gray = cv2.imread("building.jpg", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create(nfeatures=500, nOctaveLayers=3, contrastThreshold=0.04,
                       edgeThreshold=10, sigma=1.6)
kp, des = sift.detectAndCompute(gray, None)
print(len(kp), None if des is None else des.shape)
contrastThreshold ↑ → fewer weak (low-contrast) keypoints. edgeThreshold ↑ → looser edge rejection, so more points survive along elongated structures.
Detect only, then compute
kp = sift.detect(gray, None)
kp, des = sift.compute(gray, kp)
Brute-force L2 matching + ratio test
import cv2
im1 = cv2.imread("a.jpg", cv2.IMREAD_GRAYSCALE)
im2 = cv2.imread("b.jpg", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
k1, d1 = sift.detectAndCompute(im1, None)
k2, d2 = sift.detectAndCompute(im2, None)
bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=False)
pairs = bf.knnMatch(d1, d2, k=2)
good = []
for pair in pairs:
    if len(pair) < 2:
        continue
    m, n = pair
    if m.distance < 0.7 * n.distance:
        good.append(m)
vis = cv2.drawMatches(im1, k1, im2, k2, good[:80], None,
                      flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
FLANN for larger descriptor sets
For thousands of keypoints, FLANN's approximate nearest-neighbor search can be much faster than exhaustive brute-force matching. For float descriptors like SIFT, use a KD-tree (or k-means) index, with the checks parameter tuned to your speed/accuracy trade-off.
import cv2
import numpy as np
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)
flann = cv2.FlannBasedMatcher(index_params, search_params)
d1f = d1.astype(np.float32)
d2f = d2.astype(np.float32)
pairs = flann.knnMatch(d1f, d2f, k=2)
Environment notes
If cv2.SIFT_create is missing, upgrade to a recent opencv-python: SIFT returned to the main build in OpenCV 4.4.0, after the patent expired in 2020. Older versions exposed it as cv2.xfeatures2d.SIFT_create and required opencv-contrib-python. Always check your OpenCV version with print(cv2.__version__).
Takeaways
- SIFT descriptors are float vectors; match with NORM_L2 (or L1).
- Use kNN + ratio test or geometry (findHomography + RANSAC) to drop outliers.
- Heavier than ORB; use FLANN when matching large batches.