Computer Vision Chapter 23

Kalman filter

The Kalman filter estimates a hidden state from noisy measurements under a linear dynamical model with Gaussian noise. Each step has predict (time update using motion model) and correct (fuse new observation). In tracking, the state might be position and velocity of a box center; measurements come from a detector or tracker. OpenCV’s KalmanFilter class implements the linear Kalman recursions in C++ with Python bindings.

State, model, noise

Discrete-time model: xk = F xk−1 + w with process noise w; measurement zk = H xk + v. The filter maintains mean and covariance of the state estimate. F encodes constant-velocity or constant-acceleration assumptions; H picks which state components we observe (e.g. only position).

Process noise Q

Larger Q → trust motion model less, follow measurements more (jittery but responsive).

Measurement noise R

Larger R → smoother state, slower to react to true maneuvers.

2D centroid with constant velocity

State [cx, cy, vx, vy]ᵀ; measurement [cx, cy]ᵀ. Time step assumes unit frame interval (scale F if you know Δt).

import cv2
import numpy as np

kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=np.float32)
kf.measurementMatrix = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
], dtype=np.float32)

kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
kf.errorCovPost = np.eye(4, dtype=np.float32)

# First observation (e.g. detector center)
z = np.array([[120.0], [80.0]], dtype=np.float32)
kf.statePost = np.array([[z[0,0]], [z[1,0]], [0], [0]], dtype=np.float32)

pred = kf.predict()
z2 = np.array([[125.0], [82.0]], dtype=np.float32)
est = kf.correct(z2)
# est[0:2] = smoothed center

Per-frame loop pattern

# Each video frame:
prediction = kf.predict()
if detection_available:
    z = np.array([[mx], [my]], dtype=np.float32)
    state = kf.correct(z)
else:
    state = prediction  # coast on prediction (occlusion)

Long occlusions need gating or a separate re-acquisition module—pure Kalman drift grows without measurements.

Beyond linear Kalman

Nonlinear motion or sensors use Extended Kalman Filter (EKF) or Unscented Kalman Filter (UKF). Particle filters handle multi-modal uncertainty. Libraries like filterpy offer EKF/UKF in Python; OpenCV focuses on the linear case.

Takeaways

  • Predict then correct each step; tune Q and R for your noise and frame rate.
  • Great for smoothing box centers before association (SORT).
  • Use EKF/UKF when models are nonlinear (camera projection, IMU fusion).

Quick FAQ

Check matrix dimensions, symmetry of covariances, and that measurementMatrix matches measurement shape. Numerical issues—try smaller process noise or double precision.

Extend state to [cx, cy, w, h, vx, vy, …] with block-diagonal F or separate filters per quantity; measurements become box from detector.