Computer Vision Chapter 27

Stereo vision

Given two calibrated cameras with known relative pose, stereo vision recovers depth by finding pixel correspondences between rectified left/right images. Disparity (the horizontal shift between matched pixels) converts to depth via the baseline and focal length. OpenCV provides stereoCalibrate, stereoRectify, block matchers (BM, SGBM), and reprojectImageTo3D for dense point clouds. This chapter walks through calibration linkage, rectification, several matcher presets, and 3D reprojection, with multiple code examples.

Depth from disparity

After rectification, corresponding points lie on the same scanline. If the focal length is f (in pixels) and the baseline is B (in world units), depth is Z ≈ f · B / d, where d is the disparity in pixels. The Q matrix returned by stereoRectify encodes this relationship for reprojectImageTo3D.
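To make the relation concrete, here is a minimal sketch. The rig numbers (f_px, B_m) are illustrative, not values from a real calibration:

```python
# Depth from disparity: Z = f * B / d (illustrative rig numbers below).
f_px = 640.0   # focal length in pixels (made up for this example)
B_m = 0.25     # baseline in meters (made up for this example)

def depth_from_disparity(d_px, f=f_px, B=B_m):
    """Z = f * B / d; non-positive disparity means no valid match."""
    return f * B / d_px if d_px > 0 else None

Z = depth_from_disparity(40.0)  # 640 * 0.25 / 40 = 4.0 m
```

Note the inverse relationship: depth resolution degrades quadratically with distance, since a one-pixel disparity error costs more Z at small d.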

Stereo calibration and rectification

Assume each camera is calibrated (K1, D1, K2, D2). Collect paired chessboard views; stereoCalibrate estimates R, T from camera 1 to camera 2, then stereoRectify builds rectifying transforms and Q.

import cv2
import numpy as np

# K1, D1, K2, D2 from mono calib; image_size = (w, h)
flags = cv2.CALIB_FIX_INTRINSIC
criteria = (cv2.TERM_CRITERIA_MAX_ITER + cv2.TERM_CRITERIA_EPS, 100, 1e-5)

rms, K1o, D1o, K2o, D2o, R, T, E, F = cv2.stereoCalibrate(
    objpoints, imgpoints_l, imgpoints_r,
    K1, D1, K2, D2, image_size, criteria=criteria, flags=flags)

R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(
    K1o, D1o, K2o, D2o, image_size, R, T, alpha=0)

In Python, stereoCalibrate returns nine values: retval, K1, D1, K2, D2, R, T, E, F. With CALIB_FIX_INTRINSIC the returned intrinsics usually match the inputs, but they must still be unpacked.

Build remap maps

map1x, map1y = cv2.initUndistortRectifyMap(K1o, D1o, R1, P1, image_size, cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K2o, D2o, R2, P2, image_size, cv2.CV_32FC1)

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)
left_r = cv2.remap(left, map1x, map1y, cv2.INTER_LINEAR)
right_r = cv2.remap(right, map2x, map2y, cv2.INTER_LINEAR)

alpha=0 crops to valid pixels only; alpha=1 keeps all source pixels (invalid black regions may appear at the borders).

StereoBM (fast block matching)

stereo_bm = cv2.StereoBM_create(numDisparities=128, blockSize=15)
stereo_bm.setPreFilterCap(31)
stereo_bm.setTextureThreshold(10)
stereo_bm.setUniquenessRatio(15)
stereo_bm.setSpeckleWindowSize(100)
stereo_bm.setSpeckleRange(2)

disp_bm = stereo_bm.compute(left_r, right_r).astype(np.float32) / 16.0

StereoSGBM (quality presets)

# Preset A: balanced
sgbm_a = cv2.StereoSGBM_create(
    minDisparity=0, numDisparities=128, blockSize=5,
    P1=8 * 3 * 5**2, P2=32 * 3 * 5**2,
    disp12MaxDiff=1, uniquenessRatio=10,
    speckleWindowSize=100, speckleRange=2, mode=cv2.STEREO_SGBM_MODE_SGBM_3WAY)

# Preset B: finer but slower (P1/P2 scaled for the 7x7 window)
win = 7
P1, P2 = 8 * 3 * win**2, 32 * 3 * win**2
sgbm_b = cv2.StereoSGBM_create(
    minDisparity=0, numDisparities=256, blockSize=win,
    P1=P1, P2=P2, uniquenessRatio=5, speckleWindowSize=150)

disp = sgbm_a.compute(left_r, right_r).astype(np.float32) / 16.0

numDisparities must be divisible by 16. blockSize must be odd; larger values give smoother maps with less fine detail.
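Since the matchers reject search ranges that are not multiples of 16, a tiny helper can round an expected maximum disparity up to a legal value (round_up_disparities is an illustrative name, not an OpenCV function):

```python
def round_up_disparities(max_expected_disp):
    """Round an expected max disparity up to the next multiple of 16,
    since StereoBM/StereoSGBM require numDisparities % 16 == 0."""
    return ((int(max_expected_disp) + 15) // 16) * 16
```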

Optional: WLS filter (smoother disparity)

right_matcher = cv2.ximgproc.createRightMatcher(sgbm_a)
disp_left = sgbm_a.compute(left_r, right_r)
disp_right = right_matcher.compute(right_r, left_r)

wls = cv2.ximgproc.createDisparityWLSFilter(matcher_left=sgbm_a)
wls.setLambda(8000)
wls.setSigmaColor(1.5)
disp_wls = wls.filter(disp_left, left_r, None, disp_right)

Requires the ximgproc module from the opencv-contrib build (pip install opencv-contrib-python).

Reproject to XYZ

points_3d = cv2.reprojectImageTo3D(disp, Q)
# Invalid matches are marked with minDisparity - 1 (here -1 after the /16 scaling)
mask = disp > disp.min()
cloud = points_3d[mask]  # Nx3 float32, in the rectified camera-1 frame encoded by Q

Visualize disparity

disp_vis = cv2.normalize(disp, None, 0, 255, cv2.NORM_MINMAX, dtype=cv2.CV_8U)
disp_color = cv2.applyColorMap(disp_vis, cv2.COLORMAP_JET)

Takeaways

  • Rectification is mandatory for standard row-aligned matchers.
  • SGBM usually beats BM on thin structures; both need texture.
  • Use Q + valid disparity mask for meaningful 3D points.

Quick FAQ

Q: Why is the recovered depth wrong or noisy?
A: Check exposure sync and rectification quality, add texture (e.g. a projected pattern), or reduce the baseline at close range. A too-small numDisparities clips true shifts.
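To size numDisparities from the working range: the nearest expected surface produces the largest disparity, d_max = f · B / Z_min. A back-of-the-envelope check, with illustrative numbers (not from this chapter's calibration):

```python
# Illustrative rig values: focal length (px), baseline (m), nearest surface (m).
f_px, B_m = 700.0, 0.12
z_min_m = 0.5

# Largest disparity the matcher must be able to represent.
d_max = f_px * B_m / z_min_m          # 168 px at 0.5 m

# Round up to the next multiple of 16, as the matchers require.
num_disp = ((int(d_max) + 15) // 16) * 16
```

If num_disp comes out large (say, over 256), consider downscaling the images or shrinking the baseline instead of paying the matching cost.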

Q: How do I get metric scale?
A: Express the baseline in meters and keep calibration units consistent; the square_size used in the object points must match the real board for metric scale.