Computer Vision Chapter 37

Optical flow

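Optical flow assigns a 2D motion vector (u, v) to every pixel (dense) or to a set of tracked points (sparse). Classical methods start from the brightness constancy assumption: a point keeps the same intensity as it moves between frames, which for small motion linearizes to I_x u + I_y v + I_t = 0. Lucas–Kanade (LK) solves this linearized system by least squares over a local window, so it works best on corners and textured patches under small motion. Pyramidal LK runs the same solver coarse-to-fine to handle larger displacements. Gunnar Farnebäck's method estimates dense flow from local polynomial expansions of the image. Horn–Schunck adds a global smoothness term and minimizes one energy over the whole image. OpenCV exposes pyramidal LK and Farnebäck directly (calcOpticalFlowPyrLK and calcOpticalFlowFarneback); examples below.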
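To make the local Lucas–Kanade solve concrete, here is a minimal NumPy sketch for a single window and a single pyramid level. The function name lk_window_flow and its parameters are illustrative, not part of OpenCV, and it assumes small (sub-pixel to about one pixel) motion and a window that lies fully inside the image.

```python
import numpy as np

def lk_window_flow(I1, I2, cx, cy, half=7):
    """Least-squares (u, v) for the window centered at column cx, row cy.

    Illustrative single-level Lucas-Kanade step, not OpenCV's implementation.
    """
    I1 = np.asarray(I1, dtype=np.float64)
    I2 = np.asarray(I2, dtype=np.float64)
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(I1)
    It = I2 - I1  # temporal derivative for a unit time step
    win = (slice(cy - half, cy + half + 1), slice(cx - half, cx + half + 1))
    # One brightness-constancy equation per pixel: Ix*u + Iy*v = -It
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    d, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
    return float(d[0]), float(d[1])
```

The system is well conditioned only when the window contains gradients in two different directions, which is why corners and texture track better than flat regions or straight edges.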

Sparse: Lucas–Kanade + pyramid

import cv2
import numpy as np

cap = cv2.VideoCapture("clip.mp4")
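ret, old = cap.read()
if not ret:
    raise RuntimeError("could not read the first frame of clip.mp4")
old_gray = cv2.cvtColor(old, cv2.COLOR_BGR2GRAY)
# Shi-Tomasi corners as float32 with shape (N, 1, 2); None if no corner is found
pts = cv2.goodFeaturesToTrack(old_gray, maxCorners=200, qualityLevel=0.01, minDistance=7, blockSize=7)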

lk_params = dict(
    winSize=(21, 21),
    maxLevel=3,
    criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01),
)

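while pts is not None and len(pts) > 0:  # stop once every track is lost
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # st[i] == 1 if point i was found in the new frame; err is a per-point error measure
    new_pts, st, err = cv2.calcOpticalFlowPyrLK(old_gray, gray, pts, None, **lk_params)
    if new_pts is None:
        break
    good_new = new_pts[st == 1]  # shape (K, 2)
    good_old = pts[st == 1]
    # draw line from old to new for visualization
    old_gray = gray.copy()
    pts = good_new.reshape(-1, 1, 2)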

cap.release()

Re-run goodFeaturesToTrack periodically to re-seed points, because tracks drift or are dropped (st == 0) over time.
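One way to re-seed without duplicating points that are still being tracked is to keep only the new candidates that lie far enough from every existing track. The helper below is a sketch of that idea in plain NumPy; merge_seeds and min_dist are hypothetical names, and the (N, 1, 2) float32 layout is assumed to match what goodFeaturesToTrack and calcOpticalFlowPyrLK use.

```python
import numpy as np

def merge_seeds(tracked, candidates, min_dist=7.0):
    """Append candidates that are at least min_dist pixels from every tracked point.

    Hypothetical helper: inputs are (N, 1, 2) point arrays (candidates may be
    None); the result is float32 with shape (M, 1, 2).
    """
    tracked = np.asarray(tracked, dtype=np.float32).reshape(-1, 2)
    if candidates is None:
        candidates = np.empty((0, 2), dtype=np.float32)
    candidates = np.asarray(candidates, dtype=np.float32).reshape(-1, 2)
    if len(tracked) == 0:
        keep = candidates
    else:
        # Pairwise distances, shape (num_candidates, num_tracked)
        diff = candidates[:, None, :] - tracked[None, :, :]
        dist = np.linalg.norm(diff, axis=2)
        keep = candidates[dist.min(axis=1) >= min_dist]
    return np.concatenate([tracked, keep]).reshape(-1, 1, 2)
```

In the tracking loop this could be called every few dozen frames, or whenever len(pts) falls below a threshold, with candidates taken from a fresh goodFeaturesToTrack call on the current gray frame.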

Dense: Farneback

# prev_gray, next_gray: consecutive 8-bit grayscale frames of equal size
flow = cv2.calcOpticalFlowFarneback(
    prev_gray, next_gray, None,
    pyr_scale=0.5, levels=3, winsize=15, iterations=3,
    poly_n=5, poly_sigma=1.2, flags=0,
)
# flow has shape (H, W, 2), dtype float32: flow[..., 0] = dx, flow[..., 1] = dy, in pixels
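A dense field is easier to inspect when it is sampled on a coarse grid. The sketch below turns an (H, W, 2) flow array into arrow start and end points; flow_to_arrows and step are illustrative names, and the only assumption is the dx/dy channel layout described above.

```python
import numpy as np

def flow_to_arrows(flow, step=16):
    """Sample a dense flow field every `step` pixels.

    Returns (starts, ends), each of shape (K, 2) in (x, y) order, where
    ends = starts + flow at the sampled pixels.
    """
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[step // 2:h:step, step // 2:w:step]
    ys, xs = ys.ravel(), xs.ravel()
    starts = np.stack([xs, ys], axis=1).astype(np.float32)
    ends = starts + flow[ys, xs]  # flow[y, x] = (dx, dy)
    return starts, ends
```

Each (start, end) pair can then be rounded to integers and drawn with cv2.arrowedLine on a copy of the frame.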

Visualize flow as HSV

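# Polar form of the flow: mag in pixels, ang in radians in [0, 2*pi)
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
hsv = np.zeros((*flow.shape[:2], 3), dtype=np.uint8)
hsv[..., 0] = ang * 180 / np.pi / 2  # hue = direction; 8-bit OpenCV hue spans 0..179
hsv[..., 1] = 255                    # full saturation
hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)  # value = speed, rescaled per frame
bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)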
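Min-max normalization rescales every frame independently, so the same brightness can mean different speeds in different frames. If comparable brightness across a video matters, one option is a fixed scale. The sketch below builds the HSV image with NumPy only; flow_to_hsv_fixed and max_mag are hypothetical names, and max_mag (pixels per frame mapped to full brightness) is a value you would pick for your footage.

```python
import numpy as np

def flow_to_hsv_fixed(flow, max_mag=20.0):
    """Map flow direction to hue and speed to value using a fixed speed scale.

    Returns a uint8 HSV image in OpenCV's 8-bit convention (hue in 0..179).
    """
    dx = flow[..., 0].astype(np.float64)
    dy = flow[..., 1].astype(np.float64)
    mag = np.hypot(dx, dy)
    ang = np.arctan2(dy, dx) % (2 * np.pi)  # radians in [0, 2*pi)
    hsv = np.zeros(flow.shape[:2] + (3,), dtype=np.uint8)
    hsv[..., 0] = np.round(np.degrees(ang) / 2) % 180
    hsv[..., 1] = 255
    # Speeds above max_mag saturate at full brightness.
    hsv[..., 2] = np.round(255 * np.clip(mag / max_mag, 0.0, 1.0))
    return hsv
```

The result can be passed to cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR) exactly like the min-max version.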

Horn–Schunck (idea)

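Horn–Schunck minimizes the energy E(u, v) = ∬ [ (I_x u + I_y v + I_t)² + λ(|∇u|² + |∇v|²) ] dx dy. The first term enforces brightness constancy and the second penalizes non-smooth flow, so a larger λ gives a smoother dense field. It is the classic global method. The core OpenCV Python bindings do not offer it as a single call; implement it with an iterative scheme or use a specialized library.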
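The classic iterative scheme alternates between averaging the current flow over each pixel's neighbors and correcting that average along the image gradient. The NumPy sketch below follows that scheme under stated assumptions: horn_schunck and neighbor_average are illustrative names, alpha squared plays the role of λ (up to a constant from discretizing the Laplacian), borders are handled by edge replication, and there is no pyramid, so it only suits small motion.

```python
import numpy as np

def neighbor_average(f):
    """Weighted mean of the 8 neighbors (1/6 for edge neighbors, 1/12 for diagonals)."""
    p = np.pad(f, 1, mode="edge")
    return ((p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 6.0
            + (p[:-2, :-2] + p[:-2, 2:] + p[2:, :-2] + p[2:, 2:]) / 12.0)

def horn_schunck(I1, I2, alpha=1.0, n_iter=200):
    """Dense flow (u, v) between two grayscale frames; single-scale sketch."""
    I1 = np.asarray(I1, dtype=np.float64)
    I2 = np.asarray(I2, dtype=np.float64)
    Iy, Ix = np.gradient(I1)  # derivatives along rows (y), then columns (x)
    It = I2 - I1
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    denom = alpha ** 2 + Ix ** 2 + Iy ** 2
    for _ in range(n_iter):
        u_bar = neighbor_average(u)
        v_bar = neighbor_average(v)
        # Brightness-constancy residual of the averaged flow, scaled per pixel
        r = (Ix * u_bar + Iy * v_bar + It) / denom
        u = u_bar - Ix * r
        v = v_bar - Iy * r
    return u, v
```

A suitable alpha depends on the intensity range: for 8-bit images values in the tens are a common starting point, while images scaled to about unit range need a much smaller alpha.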

Takeaways

  • Sparse LK: fast, needs texture; pyramids extend range.
  • Farneback: dense field; heavier per frame.
  • Deep learning flow (RAFT, PWC-Net) often wins on accuracy for hard motion.

Quick FAQ

Aperture problem: along a straight edge, only the component of motion normal to the edge is observable locally. Corners and texture resolve the ambiguity.
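The ambiguity shows up numerically in the 2x2 structure tensor of a patch, which is the matrix a local least-squares flow solve has to invert. A minimal sketch, with structure_tensor_eigs as an illustrative name: on a pure edge the smaller eigenvalue is zero, so motion along the edge cannot be recovered, while a corner gives two clearly positive eigenvalues.

```python
import numpy as np

def structure_tensor_eigs(patch):
    """Return (lambda_min, lambda_max) of the summed gradient outer products."""
    patch = np.asarray(patch, dtype=np.float64)
    Iy, Ix = np.gradient(patch)  # derivatives along rows (y), then columns (x)
    M = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    lam_min, lam_max = np.linalg.eigvalsh(M)  # ascending order
    return float(lam_min), float(lam_max)

# Vertical step edge: intensity changes only along x.
edge = np.zeros((16, 16))
edge[:, 8:] = 1.0

# Corner: intensity changes along both x and y.
corner = np.zeros((16, 16))
corner[8:, 8:] = 1.0
```

Comparing structure_tensor_eigs(edge) with structure_tensor_eigs(corner) shows the difference; thresholding the smaller eigenvalue is the same criterion the Shi-Tomasi detector behind goodFeaturesToTrack relies on.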

Large displacements: increase the number of pyramid levels, reduce the interval between frames, or switch to a coarse-to-fine or deep-learning flow method.
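As a rough guide to how many levels are enough, note that each pyramid level shrinks the apparent displacement by the pyramid scale. The helper below is only a rule-of-thumb calculation: min_pyramid_level and per_level_limit_px are hypothetical names, and the limit (how many pixels of motion a single-level solve can handle) is an assumption you choose, not an OpenCV parameter. calcOpticalFlowPyrLK halves the image at each level, while Farneback takes the scale as pyr_scale.

```python
def min_pyramid_level(max_disp_px, per_level_limit_px, pyr_scale=0.5):
    """Smallest top level L such that max_disp_px * pyr_scale**L <= per_level_limit_px.

    Level 0 is the full-resolution image, so L corresponds to maxLevel in
    calcOpticalFlowPyrLK (which uses L + 1 images in total).
    """
    if not 0.0 < pyr_scale < 1.0:
        raise ValueError("pyr_scale must be between 0 and 1")
    if per_level_limit_px <= 0:
        raise ValueError("per_level_limit_px must be positive")
    level = 0
    disp = float(max_disp_px)
    while disp > per_level_limit_px:
        disp *= pyr_scale  # displacement as seen one level coarser
        level += 1
    return level
```

For example, a 40-pixel motion with an assumed 5-pixel per-level limit needs the displacement halved three times (40, 20, 10, 5), so the function returns 3.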