Updated 2026
Image Transformations: 20 Essential Q&A
Geometric image transforms—when you need affine vs perspective, how warping works, and common interview pitfalls.
~11 min read
20 questions
Beginner–Intermediate
affine, homography, warp, interpolation
1
What is a geometric image transformation?
⚡ easy
Answer: A mapping that moves pixel locations (translation, rotation, scale, affine, or perspective) and then resamples intensities on the new grid. It changes spatial layout but leaves the semantic label unchanged, provided annotations are transformed consistently (e.g. bounding-box corners are mapped with the same transform).
2
Define translation of an image.
⚡ easy
Answer: Shifting all pixels by offsets (tx, ty). Implemented by moving the sampling grid or adjusting the transform matrix with identity + translation column. Boundaries may require padding or cropping.
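The shift-and-pad behavior can be sketched in plain NumPy. This is a minimal illustration, assuming integer offsets and constant padding; the helper name translate and its fill parameter are hypothetical, not part of any library.

```python
import numpy as np

def translate(img, tx, ty, fill=0):
    """Shift img by integer offsets (tx right, ty down), padding exposed pixels with `fill`."""
    h, w = img.shape[:2]
    out = np.full_like(img, fill)
    # Destination window that stays inside the canvas.
    x0, x1 = max(tx, 0), min(w + tx, w)
    y0, y1 = max(ty, 0), min(h + ty, h)
    if x0 < x1 and y0 < y1:
        # Copy the matching source window; everything else keeps the fill value.
        out[y0:y1, x0:x1] = img[y0 - ty:y1 - ty, x0 - tx:x1 - tx]
    return out

img = np.arange(12).reshape(3, 4)
shifted = translate(img, tx=1, ty=0)   # content moves right, left column is padded
```

Content that moves past the canvas edge is cropped, and the exposed border is filled, which is the padding-or-cropping trade-off mentioned in the answer.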
3
What is isotropic vs anisotropic scaling?
⚡ easy
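Answer: Isotropic: the same scale on both axes (sx = sy), which preserves angles and aspect ratio. Anisotropic: sx ≠ sy, which stretches content and can turn circles into ellipses. For detection labels, anisotropic scaling changes each box's aspect ratio, so boxes must be scaled with the same sx and sy as the image.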
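The effect on a detection box can be checked with a few lines of arithmetic. This is a small sketch, assuming boxes in (x1, y1, x2, y2) form and scaling about the origin; scale_box and aspect_ratio are illustrative helper names.

```python
def scale_box(box, sx, sy):
    """Scale an (x1, y1, x2, y2) box about the origin by sx horizontally and sy vertically."""
    x1, y1, x2, y2 = box
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)

def aspect_ratio(box):
    """Width divided by height."""
    x1, y1, x2, y2 = box
    return (x2 - x1) / (y2 - y1)

box = (10, 20, 50, 40)             # width 40, height 20
iso = scale_box(box, 0.5, 0.5)     # isotropic: aspect ratio is unchanged
aniso = scale_box(box, 0.5, 1.0)   # anisotropic: the box becomes square
```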
4
How is rotation about the origin represented in 2D?
📊 medium
Answer: The linear part is the matrix [[cos θ, -sin θ],[sin θ, cos θ]]. In practice, rotate about a chosen center (usually the image center) by composing translate, rotate, translate back. Large rotations need a bigger canvas, otherwise the corners are cropped.
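The translate-rotate-translate composition can be written out with 3×3 homogeneous matrices. This is a sketch under the column-vector convention; rotation_about is a hypothetical helper name.

```python
import numpy as np

def rotation_about(cx, cy, theta):
    """3x3 matrix rotating by theta radians about (cx, cy): T(c) @ R @ T(-c)."""
    c, s = np.cos(theta), np.sin(theta)
    to_origin = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], dtype=float)
    rotate = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)
    back = np.array([[1, 0, cx], [0, 1, cy], [0, 0, 1]], dtype=float)
    return back @ rotate @ to_origin

M = rotation_about(100, 50, np.pi / 2)
center = M @ np.array([100, 50, 1.0])   # the rotation center is a fixed point
moved = M @ np.array([101, 50, 1.0])    # one unit along +x from the center
```

Here moved lands at (100, 51). Because the image y-axis points down, a positive θ in this matrix appears clockwise on screen.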
5
What does flipping do for ML?
⚡ easy
Answer: Horizontal flip is a common label-preserving augmentation for many object classes; vertical flip may break semantics (people, text, traffic scenes). Always validate against dataset semantics.
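A horizontal flip has to be applied to the annotations as well as the pixels. The sketch below assumes boxes in (x1, y1, x2, y2) form with x measured from 0 to the image width; hflip is an illustrative helper, not a library function.

```python
import numpy as np

def hflip(img, boxes):
    """Flip an image left-right and mirror its (x1, y1, x2, y2) boxes to match."""
    w = img.shape[1]
    flipped = img[:, ::-1].copy()
    boxes = np.asarray(boxes, dtype=float)
    out = boxes.copy()
    # Mirror x about the image width; x1 and x2 swap roles so x1 <= x2 still holds.
    out[:, 0] = w - boxes[:, 2]
    out[:, 2] = w - boxes[:, 0]
    return flipped, out

img = np.arange(8).reshape(2, 4)
flipped, boxes = hflip(img, [[0, 0, 1, 2]])   # box on the left edge moves to the right edge
```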
6
Homogeneous coordinates for 2D transforms?
📊 medium
Answer: Represent point (x,y) as (x,y,1). Allows affine maps as 3×3 matrices acting on homogeneous vectors, unifying translation with linear maps for composition.
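As a small illustration of why the extra coordinate helps, the sketch below composes a scale and a translation into one 3×3 matrix and applies it to points. The helper names to_h, from_h, and apply are assumptions made for this example.

```python
import numpy as np

def to_h(pts):
    """Append a 1 to each (x, y) row, giving (x, y, 1)."""
    pts = np.asarray(pts, dtype=float)
    return np.hstack([pts, np.ones((len(pts), 1))])

def from_h(pts_h):
    """Divide by the last coordinate and drop it."""
    return pts_h[:, :2] / pts_h[:, 2:3]

def apply(M, pts):
    """Apply a 3x3 matrix to an (N, 2) array of points."""
    return from_h(to_h(pts) @ M.T)

T = np.array([[1, 0, 5], [0, 1, -2], [0, 0, 1]], dtype=float)  # translate by (5, -2)
S = np.diag([2.0, 2.0, 1.0])                                    # scale by 2
pts = apply(T @ S, [[1, 1], [0, 3]])   # scale first, then translate
```

Without the homogeneous 1, translation could not be expressed as a matrix product and would have to be handled as a separate addition.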
7
What is an affine transformation?
📊 medium
Answer: Maps parallel lines to parallel lines. It combines a linear transform (rotation, scale, shear) with a translation. It preserves ratios of lengths along a line, but not lengths or angles in general; those survive only under the more constrained similarity and Euclidean models.
8
How many degrees of freedom does a 2D affine map have?
📊 medium
Answer: Six (4 in the 2×2 linear part + 2 translation). You need 3 point correspondences (non-degenerate) to estimate it in general.
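Three correspondences give six equations for the six unknowns. The sketch below solves them with least squares, so it also accepts more than three points; estimate_affine is a hypothetical helper and the matrix true_A is made-up test data.

```python
import numpy as np

def estimate_affine(src, dst):
    """Solve for the 2x3 matrix A with dst = A @ [x, y, 1]^T in the least-squares sense."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    X = np.hstack([src, np.ones((len(src), 1))])    # (N, 3)
    # The x and y outputs are two linear systems that share the same X.
    A_T, *_ = np.linalg.lstsq(X, dst, rcond=None)   # (3, 2)
    return A_T.T                                     # (2, 3)

true_A = np.array([[1.2, 0.3, 10.0], [-0.1, 0.9, -4.0]])
src = np.array([[0, 0], [1, 0], [0, 1]], dtype=float)   # three non-collinear points
dst = np.hstack([src, np.ones((3, 1))]) @ true_A.T
A = estimate_affine(src, dst)
```

If the three source points were collinear, X would be rank-deficient and the solution would no longer be unique, which is the degenerate case the answer warns about.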
9
How does perspective differ from affine?
🔥 hard
Answer: Projective maps preserve collinearity but not parallelism—parallel world lines can converge in the image (vanishing points). Needed for planes viewed at an angle, document scanning, and bird’s-eye view from ground cameras.
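The loss of parallelism can be seen numerically. In the sketch below, the matrix H is an arbitrary example whose bottom row differs from [0, 0, 1]; two horizontal lines stop being parallel after mapping and meet at a finite vanishing point.

```python
import numpy as np

H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.002, 0.0, 1.0]])   # nonzero entry in the bottom row: projective, not affine

def apply_h(H, p):
    """Map a 2D point through H, including the perspective divide."""
    x, y, w = H @ np.array([p[0], p[1], 1.0])
    return np.array([x / w, y / w])

def direction(p, q):
    """Unit vector from p to q."""
    d = q - p
    return d / np.linalg.norm(d)

# Two parallel lines with direction (1, 0): y = 0 and y = 100.
da = direction(apply_h(H, (0, 0)), apply_h(H, (100, 0)))
db = direction(apply_h(H, (0, 100)), apply_h(H, (100, 100)))
cross = da[0] * db[1] - da[1] * db[0]   # zero only if the mapped lines are still parallel

# The point at infinity in direction (1, 0) maps to a finite vanishing point.
vx, vy, vw = H @ np.array([1.0, 0.0, 0.0])
vanishing_point = np.array([vx / vw, vy / vw])
```

For an affine matrix the bottom row is [0, 0, 1], so vw would be 0 and the point at infinity would stay at infinity.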
10
What is a homography?
🔥 hard
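Answer: A 3×3 projective transform, defined up to scale (8 DOF), that maps one plane to another under pinhole imaging. It relates two views of the same planar surface. It can be estimated from at least 4 point correspondences with no three collinear, for example with the Direct Linear Transform (DLT); with noisy matches a robust estimator such as RANSAC is commonly added.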
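A bare-bones DLT can be sketched in NumPy. This version assumes exact correspondences and skips the coordinate normalization and outlier rejection that practical pipelines usually add; homography_dlt is an illustrative name and true_H is made-up test data.

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate H (3x3, up to scale) from at least 4 correspondences via the basic DLT."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows, dtype=float)
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]   # fix the free scale (assumes H[2, 2] is nonzero)

true_H = np.array([[1.1, 0.1, 5.0], [-0.05, 0.95, 3.0], [0.001, 0.0005, 1.0]])
src = np.array([[0, 0], [100, 0], [100, 100], [0, 100]], dtype=float)
proj = np.hstack([src, np.ones((4, 1))]) @ true_H.T
dst = proj[:, :2] / proj[:, 2:3]
H = homography_dlt(src, dst)
```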
11
Forward vs inverse warping?
📊 medium
Answer: Forward warping maps each source pixel to the destination, which can leave holes and overlaps. Inverse warping visits each destination pixel and samples the source through the inverse map; this avoids gaps and is the standard approach in OpenCV's warp* functions, with a chosen interpolator.
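Inverse warping can be sketched in a few lines of NumPy. This is a simplified illustration, assuming a single-channel image, nearest-neighbor sampling, and zero fill outside the source; warp_inverse_nn is a hypothetical helper, not OpenCV's implementation.

```python
import numpy as np

def warp_inverse_nn(src, M, out_shape):
    """Inverse-warp a 2D array with 3x3 matrix M (source -> destination), nearest-neighbor."""
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    dst_pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])   # (3, N)
    # Pull every destination pixel back into the source image.
    sx, sy, sw = np.linalg.inv(M) @ dst_pts
    sx = np.rint(sx / sw).astype(int)
    sy = np.rint(sy / sw).astype(int)
    valid = (sx >= 0) & (sx < src.shape[1]) & (sy >= 0) & (sy < src.shape[0])
    out = np.zeros(h_out * w_out, dtype=src.dtype)
    out[valid] = src[sy[valid], sx[valid]]   # every destination pixel is visited exactly once
    return out.reshape(h_out, w_out)

src = np.arange(12).reshape(3, 4)
M = np.array([[1, 0, 1], [0, 1, 0], [0, 0, 1]], dtype=float)   # shift right by one pixel
out = warp_inverse_nn(src, M, (3, 4))
```

Because the loop runs over destination pixels, no output location is left unfilled, which is the property that forward mapping lacks.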
12
Why does warping need interpolation?
📊 medium
Answer: Mapped coordinates land between pixels. Nearest, bilinear, and bicubic interpolation weight a neighborhood of 1, 4, and 16 source pixels respectively, trading speed against blockiness and blur. Downscaling may need low-pass prefiltering to avoid aliasing.
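import cv2
import numpy as np
img = np.zeros((240, 320, 3), np.uint8)  # placeholder image; normally loaded with cv2.imread
h, w = img.shape[:2]
M = cv2.getRotationMatrix2D((w / 2, h / 2), 30, 1.0)  # 30° counter-clockwise about the center, scale 1
out = cv2.warpAffine(img, M, (w, h), flags=cv2.INTER_LINEAR)  # bilinear resampling, same canvas size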
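To make the interpolation step concrete, here is a minimal bilinear sampler for a single-channel image. It assumes 0 <= x <= width - 1 and 0 <= y <= height - 1; bilinear_sample is an illustrative helper name.

```python
import numpy as np

def bilinear_sample(img, x, y):
    """Sample a 2D array at real-valued (x, y) using the 4 surrounding pixels."""
    h, w = img.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)   # clamp at the right and bottom edges
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x1]
    bottom = (1 - fx) * img[y1, x0] + fx * img[y1, x1]
    return (1 - fy) * top + fy * bottom

img = np.array([[0.0, 1.0], [2.0, 3.0]])
center = bilinear_sample(img, 0.5, 0.5)   # equal weights on all four pixels
```

Nearest-neighbor would instead return one of the four pixel values unchanged, which is faster but produces blocky edges.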
13
Crop vs pad after transform?
⚡ easy
Answer: Rotation/scale can push content outside the original canvas—either expand canvas with padding (constant, reflect) or crop to a fixed size. Detection boxes must be clipped or transformed consistently.
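When expanding the canvas instead of cropping, the required size follows from the rotated bounding box. This is a small sketch of that arithmetic; rotated_canvas_size is a hypothetical helper, and the tiny epsilon only guards against floating-point noise at right angles.

```python
import math

def rotated_canvas_size(w, h, angle_deg):
    """Smallest axis-aligned (width, height) that contains a w x h image rotated by angle_deg."""
    a = math.radians(angle_deg)
    c, s = abs(math.cos(a)), abs(math.sin(a))
    new_w = math.ceil(w * c + h * s - 1e-9)
    new_h = math.ceil(w * s + h * c - 1e-9)
    return new_w, new_h

size_90 = rotated_canvas_size(640, 480, 90)    # width and height swap
size_45 = rotated_canvas_size(100, 100, 45)    # the diagonal sets the canvas size
```

The warp matrix then also needs an extra translation so the rotated content is centered in the larger canvas.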
14
Augmentation: random affine on segmentation masks?
📊 medium
Answer: Apply the same spatial map to image and mask (nearest-neighbor interpolation for label masks to avoid fractional classes). For instance segmentation, warp polygons or rasterize after transform.
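The reason for nearest-neighbor on label masks shows up even in one dimension. The sketch below resamples a row of class ids with linear interpolation and with nearest-neighbor; the values are made up for illustration.

```python
import numpy as np

# One row of a label mask: class 0 next to class 2 (class 1 is some unrelated class).
mask_row = np.array([0, 0, 2, 2])
sample_x = np.array([0.0, 1.6, 3.0])   # positions to resample at

# Linear interpolation blends the two labels into 1.2, which is not a valid class id.
linear = np.interp(sample_x, np.arange(len(mask_row)), mask_row)

# Nearest-neighbor only ever returns labels that already exist in the mask.
nearest = mask_row[np.rint(sample_x).astype(int)]
```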
15
What is image registration?
🔥 hard
Answer: Aligning two images of the same scene into a common coordinate frame—via feature matching + homography/affine, optical flow, or optimization. Used in medical imaging, panorama stitching, and super-resolution.
16
What is a similarity transform?
📊 medium
Answer: Rotation + uniform scale + translation (4 DOF in 2D). Preserves angles and ratios of lengths—good model when perspective effects are weak.
17
What is a rigid (Euclidean) transform?
⚡ easy
Answer: Rotation + translation only—preserves distances and angles (3 DOF in 2D). Models camera motion parallel to the plane or object pose without scale change.
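The difference between rigid and similarity transforms can be checked by measuring a distance before and after. In this sketch, similarity is an illustrative constructor in which s = 1 gives the rigid case; the angle and offsets are arbitrary.

```python
import numpy as np

def similarity(theta, s=1.0, tx=0.0, ty=0.0):
    """3x3 similarity transform; s = 1 gives a rigid (Euclidean) transform."""
    c, sn = s * np.cos(theta), s * np.sin(theta)
    return np.array([[c, -sn, tx], [sn, c, ty], [0, 0, 1]], dtype=float)

def apply(M, p):
    """Map a 2D point through a 3x3 matrix."""
    x, y, w = M @ np.array([p[0], p[1], 1.0])
    return np.array([x / w, y / w])

p, q = (0.0, 0.0), (3.0, 4.0)   # 5 units apart
rigid = similarity(0.7, 1.0, 10, -2)
sim = similarity(0.7, 2.0, 10, -2)

d_rigid = np.linalg.norm(apply(rigid, q) - apply(rigid, p))   # distance is preserved
d_sim = np.linalg.norm(apply(sim, q) - apply(sim, p))         # distance is multiplied by s
```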
18
How do you compose transforms?
📊 medium
Answer: Multiply their homogeneous matrices. With column vectors the rightmost matrix is applied first, so "A, then B" is written B·A. Row-vector conventions reverse this, so stay consistent with your library's convention.
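A quick numerical check shows that the order of multiplication matters. The sketch uses the column-vector convention, with an arbitrary translation and a 90 degree rotation about the origin.

```python
import numpy as np

T = np.array([[1, 0, 10], [0, 1, 0], [0, 0, 1]], dtype=float)   # translate x by 10
R = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)   # rotate 90 degrees about the origin
p = np.array([1.0, 0.0, 1.0])

rotate_then_translate = (T @ R) @ p   # R acts first: (1, 0) -> (0, 1) -> (10, 1)
translate_then_rotate = (R @ T) @ p   # T acts first: (1, 0) -> (11, 0) -> (0, 11)
```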
19
OpenCV: warpAffine vs warpPerspective?
⚡ easy
Answer: warpAffine uses a 2×3 affine map; warpPerspective uses full 3×3 homography. Choose based on whether parallelism must be preserved (affine) or full perspective correction is needed.
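The two matrix shapes are closely related: a 2×3 affine matrix is a 3×3 matrix whose last row is [0, 0, 1]. The sketch below shows the promotion and a simple test for whether a 3×3 matrix is really affine; both helper names are assumptions for this example.

```python
import numpy as np

def affine_to_homography(M):
    """Promote a 2x3 affine matrix to the equivalent 3x3 matrix with last row [0, 0, 1]."""
    return np.vstack([np.asarray(M, dtype=float), [0.0, 0.0, 1.0]])

def is_affine(H, tol=1e-12):
    """True if the bottom row of a 3x3 matrix is proportional to [0, 0, 1]."""
    H = np.asarray(H, dtype=float)
    return bool(abs(H[2, 0]) < tol and abs(H[2, 1]) < tol and abs(H[2, 2]) > tol)

M = np.array([[1.0, 0.2, 5.0], [0.0, 1.0, -3.0]])
H = affine_to_homography(M)

P = H.copy()
P[2, 0] = 0.001   # a perspective term: parallel lines are no longer preserved
```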
20
Are lens distortion and homography the same?
📊 medium
Answer: No—radial/tangential distortion is nonlinear and modeled separately (Brown-Conrady) before or jointly with pinhole projection. Undistort first, then apply homography for many planar AR/document pipelines.
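The nonlinearity is easy to demonstrate with the radial part of the model alone. The sketch below applies x_d = x(1 + k1 r² + k2 r⁴) to normalized coordinates and omits the tangential terms; distort_radial and the coefficient value are illustrative.

```python
import numpy as np

def distort_radial(xy, k1, k2=0.0):
    """Apply radial distortion to normalized image coordinates.

    x_d = x * (1 + k1 * r^2 + k2 * r^4), likewise for y, with r^2 = x^2 + y^2.
    """
    xy = np.asarray(xy, dtype=float)
    r2 = np.sum(xy ** 2, axis=-1, keepdims=True)
    return xy * (1 + k1 * r2 + k2 * r2 ** 2)

# Three collinear points on the vertical line x = 0.5.
line = np.array([[0.5, -0.5], [0.5, 0.0], [0.5, 0.5]])
bent = distort_radial(line, k1=-0.2)   # the x-coordinates no longer agree
```

A homography always maps straight lines to straight lines, so no 3×3 matrix can reproduce this bending, which is why distortion is corrected as a separate step.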
Transforms Cheat Sheet
Models
- Euclidean (3 DOF) → Similarity (4 DOF)
- Affine (6 DOF)
- Projective / H (8 DOF)
Warping
- Inverse sampling
- Interpolation choice
- Same transform for masks
Uses
- Augmentation
- Stitching / BEV
- Undistort + pinhole
💡 Pro tip: State affine vs projective using parallelism and vanishing points.