Updated 2026
Image Processing Basics: 20 Essential Q&A
Digital image fundamentals—how pixels, sampling, quantization, and storage show up in interviews.
~10 min read
20 questions
Beginner
pixels resolution channels JPEG/PNG NumPy/OpenCV
Quick Navigation
1. What is a digital image?
2. What is a pixel?
3. Sampling vs quantization
4. Resolution & aspect ratio
5. Image channels
6. Grayscale from RGB
7. Bit depth & dynamic range
1
What is a digital image in computer vision?
⚡ easy
Answer: A 2D (or 2D+channels) grid of samples where each cell is a pixel storing numeric intensity or color. It is a discrete approximation of a continuous scene after capture by a sensor and analog-to-digital conversion.
2
What is a pixel?
⚡ easy
Answer: The smallest addressable element of a raster image. Each pixel holds one or more values (e.g. gray level or R,G,B). Spatially, pixels sit on a regular grid; physically, they correspond to sensor photosites plus processing (demosaicing for color cameras).
3
Explain sampling and quantization.
📊 medium
Answer: Sampling chooses discrete spatial locations (grid resolution). Quantization maps continuous intensity to finite levels (bit depth). Together they convert a continuous image to digital form and introduce spatial and intensity approximation error.
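Both steps can be sketched in a few lines of NumPy on a synthetic ramp image (quantize below is an illustrative helper, not a library function):

```python
import numpy as np

# Synthetic 8-bit ramp image covering all 256 intensities.
img = np.arange(256, dtype=np.uint8).reshape(16, 16)

# Sampling: keep every 2nd pixel in each direction (coarser spatial grid).
sampled = img[::2, ::2]

# Quantization: collapse 256 intensity levels down to n_levels bins.
def quantize(img, n_levels):
    step = 256 // n_levels
    return (img // step) * step  # each pixel snaps to its bin's floor value

q = quantize(img, 4)  # only 4 distinct intensities remain
```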
4
What is image resolution?
⚡ easy
Answer: Usually the grid size width × height in pixels (e.g. 1920×1080). Higher resolution preserves finer detail but costs memory and compute. Aspect ratio is width/height; changing resolution without preserving ratio stretches content.
5
What are color channels?
⚡ easy
Answer: Separate 2D arrays (or stacked planes) per color component—commonly R, G, B for display. Grayscale has one channel. Multispectral/hyperspectral images have many bands beyond visible RGB.
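A minimal NumPy sketch of channels as stacked 2D planes:

```python
import numpy as np

# A 4x4 RGB image stored as three stacked 2D planes: shape (H, W, 3).
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
rgb[..., 0] = 255                                 # fill the red plane

# Each channel is itself a plain 2D array (these are views, not copies):
r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]

# Stacking the planes back along the last axis reproduces the image.
restacked = np.stack([r, g, b], axis=-1)
```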
6
How is grayscale often computed from RGB?
⚡ easy
Answer: A weighted sum approximating luminance, e.g. 0.299R + 0.587G + 0.114B (ITU-R BT.601) or simpler averages for rough work. Weights reflect human sensitivity to green; the exact formula depends on standard and use case.
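The BT.601 weighted sum is a one-liner in NumPy (rgb_to_gray is an illustrative helper; rounding before the uint8 cast avoids truncation bias):

```python
import numpy as np

def rgb_to_gray(rgb):
    # BT.601 luma weights; rgb is H x W x 3, result is H x W uint8.
    weights = np.array([0.299, 0.587, 0.114])
    return np.rint(rgb.astype(float) @ weights).astype(np.uint8)

white = np.full((2, 2, 3), 255, dtype=np.uint8)
gray = rgb_to_gray(white)   # pure white maps to full intensity
```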
7
What is bit depth? Why does it matter?
📊 medium
Answer: Bits per channel (e.g. 8-bit → 256 levels). Higher depth reduces banding and helps medical/raw workflows; 8-bit uint is standard for web and many CV datasets. HDR may use 16/32-bit float linear pipelines before tone mapping.
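A quick sketch of the level count and of widening 8-bit data to 16-bit without changing relative brightness (the factor 257 maps 255 to 65535, the new full scale):

```python
import numpy as np

bits = 8
levels = 2 ** bits                        # 256 representable values per channel

img8 = np.array([[0, 128, 255]], dtype=np.uint8)
# Widen to 16-bit: multiply by 257 so black stays 0 and white hits full scale.
img16 = img8.astype(np.uint16) * 257
```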
8
How are pixel coordinates usually indexed?
⚡ easy
Answer: Often (row, col) or (y, x) with origin at top-left, row increasing downward—matching matrix indexing in NumPy/OpenCV. Be careful when converting to math coordinates where y may increase upward.
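The convention in NumPy, as a sketch:

```python
import numpy as np

img = np.zeros((100, 200), dtype=np.uint8)   # 100 rows (y), 200 columns (x)
img[10, 50] = 255                            # row 10, col 50 -> point (x=50, y=10)

# shape is (height, width): width comes second, not first.
h, w = img.shape
```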
9
What does tensor shape (H, W, C) mean?
📊 medium
Answer: Height (rows), width (columns), channels—typical for NumPy/OpenCV images. PyTorch often uses (N, C, H, W) for batches. Interviews check you can transpose between layouts without mixing H/W.
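Converting between the two layouts is a transpose plus a batch dimension, sketched below:

```python
import numpy as np

hwc = np.zeros((480, 640, 3), dtype=np.float32)   # H, W, C (NumPy/OpenCV)
chw = np.transpose(hwc, (2, 0, 1))                # C, H, W (PyTorch-style)
nchw = chw[None]                                  # add batch dim -> N, C, H, W
```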
10
Raster vs vector graphics?
⚡ easy
Answer: Raster: pixel grid (photos, textures). Vector: curves/paths (SVG, fonts)—infinite resolution until rasterized. CV pipelines usually consume raster tensors; vector assets are rasterized for learning.
11
When choose JPEG vs PNG?
⚡ easy
Answer: JPEG: photos, smaller files, lossy, poor for sharp edges/text. PNG: lossless, transparency, screenshots and graphics. For repeated ML saves, beware JPEG compression artifacts affecting edges and noise.
12
What problems can lossy compression cause for CV?
📊 medium
Answer: Blocking, ringing, color bleeding—especially around edges. Models may overfit artifact patterns. For training data, prefer lossless or high-quality JPEG; for deployment, know your camera/codec pipeline.
13
What is aliasing when downsampling?
📊 medium
Answer: High-frequency detail folds into low frequencies as moiré or jaggies if you shrink without low-pass filtering. Fix: blur then downsample or use good resampling (area interpolation for downscaling in OpenCV).
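The blur-then-downsample fix can be sketched as block averaging, which is similar in spirit to what area interpolation does for integer shrink factors (downsample_area is an illustrative helper, not a library function):

```python
import numpy as np

def downsample_area(img, factor):
    # Average each factor x factor block: a crude low-pass filter fused
    # with subsampling.
    h, w = img.shape[:2]
    h2, w2 = h // factor, w // factor
    img = img[:h2 * factor, :w2 * factor].astype(float)
    return img.reshape(h2, factor, w2, factor).mean(axis=(1, 3))

# 1-pixel checkerboard: the worst case for naive subsampling.
checker = (np.indices((8, 8)).sum(axis=0) % 2).astype(float)
naive = checker[::2, ::2]            # aliases: samples only one phase of the pattern
safe = downsample_area(checker, 2)   # averages to mid-gray instead
```

Naive striding turns the checkerboard into a solid image (all one phase), while block averaging reports the true mid-gray average of each region.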
14
Nearest-neighbor vs bilinear interpolation?
📊 medium
Answer: Nearest: fast, blocky, preserves original values. Bilinear: smooths using 4 neighbors, better for resizing/rotation but blurs fine detail. Bicubic is smoother still; choice affects augmentation and geometric transforms.
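Both schemes can be sketched from scratch in NumPy (illustrative helpers with simplified edge handling; real libraries differ in corner conventions):

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    # Each output pixel copies the closest source pixel: blocky, exact values.
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows[:, None], cols]

def resize_bilinear(img, out_h, out_w):
    # Each output pixel blends its 4 nearest source pixels by distance.
    h, w = img.shape[:2]
    y = (np.arange(out_h) + 0.5) * h / out_h - 0.5
    x = (np.arange(out_w) + 0.5) * w / out_w - 0.5
    y0 = np.clip(np.floor(y).astype(int), 0, h - 1)
    x0 = np.clip(np.floor(x).astype(int), 0, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = np.clip(y - y0, 0, 1)[:, None]
    wx = np.clip(x - x0, 0, 1)[None, :]
    f = img.astype(float)
    top = f[y0[:, None], x0] * (1 - wx) + f[y0[:, None], x1] * wx
    bot = f[y1[:, None], x0] * (1 - wx) + f[y1[:, None], x1] * wx
    return top * (1 - wy) + bot * wy
```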
15
Typical dtypes for images in NumPy?
⚡ easy
Answer: uint8 [0,255] most common. Float images may be [0,1] or [0,255] depending on library—always normalize consistently before math or neural nets.
import numpy as np
img = np.zeros((480, 640, 3), dtype=np.uint8) # H,W,C
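Beyond range conventions, uint8 arithmetic silently wraps around, which is a classic source of bugs. A small sketch:

```python
import numpy as np

img = np.full((2, 2, 3), 255, dtype=np.uint8)
as_float = img.astype(np.float32) / 255.0      # move to the [0, 1] convention

a = np.array([200], dtype=np.uint8)
b = np.array([100], dtype=np.uint8)
wrapped = a + b                     # uint8 wraps: (200 + 100) % 256 == 44
safe = a.astype(np.int16) + b       # widen first to get the true 300
```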
16
Why does OpenCV use BGR?
⚡ easy
Answer: Historical: early camera drivers and Windows bitmap formats stored bytes in BGR order, and OpenCV kept the convention. cv2.imread returns BGR. Convert to RGB for matplotlib or PIL-centric code: cv2.cvtColor(img, cv2.COLOR_BGR2RGB). Mixing orders is a common interview “debugging” trap.
import cv2
bgr = cv2.imread('x.jpg')
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
17
What is the alpha channel?
⚡ easy
Answer: Per-pixel opacity for compositing (RGBA). Not always present. When loading to 3-channel models, you often drop alpha or premultiply RGB depending on graphics pipeline.
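The two common options, dropping alpha or flattening onto a background, sketched in NumPy:

```python
import numpy as np

rgba = np.zeros((2, 2, 4), dtype=np.uint8)   # black image with an alpha plane
rgba[..., 3] = 255                           # fully opaque

rgb = rgba[..., :3]                          # simplest option: drop alpha
alpha = rgba[..., 3:4].astype(float) / 255.0

# Or composite onto a white background before dropping the channel:
white = np.full(rgb.shape, 255, dtype=float)
flattened = (rgb * alpha + white * (1.0 - alpha)).astype(np.uint8)
```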
18
What does an image histogram show?
📊 medium
Answer: The distribution of pixel intensities (per channel or gray). Useful for exposure diagnosis, thresholding intuition, and contrast enhancement—foundation for histogram equalization (covered in later chapters).
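A histogram with one bin per intensity, sketched with np.histogram (a random test image stands in for a real one):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# One bin per intensity: hist[i] counts the pixels with value i.
hist, edges = np.histogram(img, bins=256, range=(0, 256))
```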
19
How does a video relate to images?
⚡ easy
Answer: A sequence of frames (2D images) sampled in time with a frame rate (FPS). Temporal redundancy enables compression and tracking; many CV models treat frames independently at first.
20
What is EXIF metadata?
⚡ easy
Answer: Embedded tags in JPEG/TIFF: orientation, camera settings, timestamp, GPS. The orientation tag can rotate images—some loaders ignore it, causing inconsistent training data; preprocess to canonical orientation.
Image Basics Cheat Sheet
Representation
- Grid of pixels
- Sampling + quantization
- H×W×C / dtypes
Quality
- Resolution & aspect
- Aliasing on resize
- JPEG artifacts
Code pitfalls
- BGR vs RGB
- float range [0,1] vs [0,255]
- (row,col) vs (x,y)
💡 Pro tip: State image shape, dtype, and color order before any algorithm.