Kalman filtering: CV guide

State, model, noise

Discrete-time model: x_k = F x_k−1 + w with process noise w; measurement z_k = H x_k + v. The filter maintains mean and covariance of the state estimate. F encodes constant-velocity or constant-acceleration assumptions; H picks which state components we observe (e.g. only position).

Process noise `Q`

Larger Q → trust motion model less, follow measurements more (jittery but responsive).

Measurement noise `R`

Larger R → smoother state, slower to react to true maneuvers.

2D centroid with constant velocity

State [cx, cy, vx, vy]ᵀ; measurement [cx, cy]ᵀ. Time step assumes unit frame interval (scale F if you know Δt).

import cv2
import numpy as np

kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=np.float32)
kf.measurementMatrix = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
], dtype=np.float32)

kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
kf.errorCovPost = np.eye(4, dtype=np.float32)

# First observation (e.g. detector center)
z = np.array([[120.0], [80.0]], dtype=np.float32)
kf.statePost = np.array([[z[0,0]], [z[1,0]], [0], [0]], dtype=np.float32)

pred = kf.predict()
z2 = np.array([[125.0], [82.0]], dtype=np.float32)
est = kf.correct(z2)
# est[0:2] = smoothed center

Per-frame loop pattern

# Each video frame:
prediction = kf.predict()
if detection_available:
    z = np.array([[mx], [my]], dtype=np.float32)
    state = kf.correct(z)
else:
    state = prediction  # coast on prediction (occlusion)

Long occlusions need gating or a separate re-acquisition module—pure Kalman drift grows without measurements.

Beyond linear Kalman

Nonlinear motion or sensors use Extended Kalman Filter (EKF) or Unscented Kalman Filter (UKF). Particle filters handle multi-modal uncertainty. Libraries like filterpy offer EKF/UKF in Python; OpenCV focuses on the linear case.

                    Takeaways
                    Predict then correct each step; tune Q and R for your noise and frame rate.
Great for smoothing box centers before association (SORT).
Use EKF/UKF when models are nonlinear (camera projection, IMU fusion).

                

Quick FAQ

Check matrix dimensions, symmetry of covariances, and that measurementMatrix matches measurement shape. Numerical issues—try smaller process noise or double precision.

Extend state to [cx, cy, w, h, vx, vy, …] with block-diagonal F or separate filters per quantity; measurements become box from detector.

Related Computer Vision Links

State, model, noise

Process noise Q

Measurement noise R

2D centroid with constant velocity

Per-frame loop pattern

Beyond linear Kalman

Takeaways

Quick FAQ

State blows up?

Track width/height too?

Process noise `Q`

Measurement noise `R`