Computer Vision Interview · Updated 2026

GANs: 20 Essential Q&A

Two-player game: a generator forges realistic samples while a critic learns to spot the fakes; this adversarial setup is the foundation of modern image generation.

~12 min read 20 questions Advanced
generator · discriminator · min-max · mode collapse
1 What is a GAN? ⚡ easy
Answer: A generative model with a generator G(z) that produces samples and a discriminator D(x) that judges real vs. fake; the two networks are trained adversarially.
2 State the min-max game. 🔥 hard
Answer: min_G max_D V(D,G) = E_x[log D(x)] + E_z[log(1 − D(G(z)))]; with the optimal D plugged in, the classic objective reduces to 2·JSD(p_data ‖ p_G) − log 4.
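A quick numeric sketch of the value function (plain NumPy; the function name is illustrative): when D outputs ½ on every input, V sits at its equilibrium value −log 4.

```python
import numpy as np

def value_fn(d_real, d_fake):
    """V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A maximally confused discriminator outputs 1/2 everywhere,
# and the value function equals -log 4 there.
d_half = np.full(8, 0.5)
v_eq = value_fn(d_half, d_half)
print(v_eq)  # ≈ -1.3863 = -log 4
```

Any discriminator that separates real from fake better than chance pushes V above this equilibrium value.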
3 Role of generator? 📊 medium
Answer: Maps noise z (latent) to data space—should match real data distribution at optimum.
# min_G max_D V(D,G) — alternate k steps on D, 1 on G
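The alternation in the comment above can be sketched as a plain loop; the two inner updates here are hypothetical stand-ins (counters) for one SGD step on D's and G's losses.

```python
# Minimal alternating-update schedule for min_G max_D V(D, G):
# k discriminator steps for every generator step.
def train(num_iters, k=5):
    d_steps = g_steps = 0
    for _ in range(num_iters):
        for _ in range(k):      # k steps on D (stand-in for SGD on D's loss)
            d_steps += 1
        g_steps += 1            # ...then 1 step on G
    return d_steps, g_steps

d_steps, g_steps = train(num_iters=100, k=5)
print(d_steps, g_steps)  # 500 100
```

The original GAN paper uses k = 1 in practice; WGAN-style critics typically use k = 5.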
4 Role of discriminator? 📊 medium
Answer: Binary classifier estimating probability “real”—provides training signal to G via gradient through D.
5 Nash equilibrium? 🔥 hard
Answer: At the ideal optimum, p_G = p_data and D(x) = ½ everywhere; this equilibrium is hard to reach in practice with finite capacity and SGD.
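The equilibrium follows from the closed form of the best discriminator for a fixed generator, which is easy to verify numerically (NumPy sketch):

```python
import numpy as np

def optimal_D(p_data, p_g):
    """For a fixed G, the optimal discriminator is D*(x) = p_data / (p_data + p_g)."""
    return p_data / (p_data + p_g)

# When the generator matches the data (p_g = p_data), D* = 1/2 everywhere.
p = np.array([0.2, 0.5, 0.3])
print(optimal_D(p, p))  # [0.5 0.5 0.5]
```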
6 Why unstable training? 📊 medium
Answer: Oscillating dynamics, vanishing gradients when D too good, or D too weak—need balanced updates and architecture tricks.
7 What is mode collapse? 📊 medium
Answer: G collapses to a few output modes and ignores the data's diversity; D fails to push G to cover all modes. Minibatch discrimination and unrolled GANs mitigate it.
8 DCGAN guidelines? 📊 medium
Answer: Strided convolutions, BatchNorm, no FC except input/output, ReLU in G, LeakyReLU in D—empirical recipe for stable conv GANs.
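One concrete consequence of the strided-convolution guideline: each stride-2 transposed conv doubles spatial size. A sketch of the arithmetic with the usual DCGAN choice of kernel 4, stride 2, padding 1:

```python
def conv_transpose_out(size, kernel=4, stride=2, padding=1):
    """Output size of a 2D transposed convolution (per spatial dimension)."""
    return (size - 1) * stride - 2 * padding + kernel

size = 4  # start from a 4x4 map projected from the noise vector
sizes = [size]
for _ in range(4):  # four stride-2 upsampling blocks
    size = conv_transpose_out(size)
    sizes.append(size)
print(sizes)  # [4, 8, 16, 32, 64]
```

Four such blocks take the generator from a 4×4 feature map to a 64×64 image, the resolution used in the DCGAN paper.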
9 WGAN / WGAN-GP? 🔥 hard
Answer: Use the Wasserstein distance with a Lipschitz critic (weight clipping or a gradient penalty); it gives a smoother training signal than JS divergence when the distributions have disjoint support.
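The gradient-penalty term can be checked by hand for a linear critic f(x) = w·x, whose input gradient is w at every point (NumPy sketch; λ = 10 as in the WGAN-GP paper):

```python
import numpy as np

def gradient_penalty(grad_norms, lam=10.0):
    """lam * E[(||grad f(x_hat)|| - 1)^2], the WGAN-GP regularizer."""
    return lam * np.mean((grad_norms - 1.0) ** 2)

# Linear critic f(x) = w . x: its gradient is w everywhere,
# so the penalty depends only on ||w||.
w = np.array([3.0, 4.0])  # ||w|| = 5, far from 1-Lipschitz
gp = gradient_penalty(np.array([np.linalg.norm(w)]))
print(gp)  # 10 * (5 - 1)^2 = 160.0
```

In a real implementation the gradient norm is taken at random interpolates between real and generated samples; here the norm is supplied directly to isolate the penalty itself.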
10 LSGAN / hinge? 📊 medium
Answer: Replace sigmoid BCE with least-squares or hinge losses—often more stable gradients in practice.
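The hinge variant is short enough to write out directly; it operates on raw critic logits rather than sigmoid probabilities (NumPy sketch):

```python
import numpy as np

def hinge_d_loss(d_real, d_fake):
    """Discriminator hinge loss on raw (unbounded) logits."""
    return np.mean(np.maximum(0.0, 1.0 - d_real)) + \
           np.mean(np.maximum(0.0, 1.0 + d_fake))

def hinge_g_loss(d_fake):
    """Generator hinge loss: push D's score on fakes upward."""
    return -np.mean(d_fake)

# A confident, correct D (real logits > 1, fake logits < -1) incurs zero loss,
# so its gradient stops growing once the margin is satisfied.
print(hinge_d_loss(np.array([2.0, 3.0]), np.array([-2.0, -1.5])))  # 0.0
print(hinge_g_loss(np.array([-2.0, -1.5])))  # 1.75
```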
11 Conditional GAN? 📊 medium
Answer: Both G and D conditioned on label, text, or image—enables targeted generation (class-conditional faces, etc.).
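A common way to condition the generator is simply concatenating a one-hot class label onto the noise vector (illustrative sketch; real systems may instead use embeddings or projection discriminators):

```python
import numpy as np

def conditional_input(z, label, num_classes):
    """Concatenate noise with a one-hot class label, a simple cGAN scheme."""
    one_hot = np.zeros(num_classes)
    one_hot[label] = 1.0
    return np.concatenate([z, one_hot])

z = np.random.randn(100)  # latent noise
x = conditional_input(z, label=3, num_classes=10)
print(x.shape)  # (110,): G sees the noise plus the class it must generate
```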
12 pix2pix? 📊 medium
Answer: Paired image-to-image with U-Net G and PatchGAN D—L1 + adversarial loss for aligned translation (maps→aerial).
13 CycleGAN? 🔥 hard
Answer: Translates between unpaired domains using two generators G, F with a cycle-consistency loss L_cycle(G,F) = E[‖F(G(x)) − x‖₁] + E[‖G(F(y)) − y‖₁]; enables horse↔zebra without paired data.
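The cycle-consistency idea fits in a few lines; with toy scalar mappings (stand-ins for the two generators), an F that exactly inverts G drives the loss to zero:

```python
import numpy as np

def cycle_loss(x, G, F):
    """L1 cycle-consistency: F(G(x)) should reconstruct x."""
    return np.mean(np.abs(F(G(x)) - x))

G = lambda x: x + 1.0   # "horse -> zebra" stand-in
F = lambda x: x - 1.0   # "zebra -> horse" stand-in
x = np.array([0.0, 0.5, 1.0])
print(cycle_loss(x, G, F))  # 0.0: F inverts G, so the cycle is consistent
```

The full CycleGAN objective adds the symmetric G(F(y)) term and the adversarial losses for both domains.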
14 StyleGAN idea? 📊 medium
Answer: A mapping network transforms z into an intermediate latent w, which modulates each generator layer through learned affine transforms (AdaIN), controlling style from coarse to fine; basis of high-quality face generation.
15 FID / Inception Score? 📊 medium
Answer: IS rewards samples that an Inception classifier labels confidently while spanning many classes; FID compares the mean and covariance of Inception features between generated and real sets. Lower FID is better.
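FID is the Fréchet distance between two Gaussians fit to the feature statistics. With diagonal covariances the matrix square root simplifies, which makes the formula easy to verify (NumPy sketch; the function name is illustrative):

```python
import numpy as np

def fid_diag(mu1, var1, mu2, var2):
    """Frechet distance between Gaussians with diagonal covariances:
    ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 (C1 C2)^(1/2))."""
    return np.sum((mu1 - mu2) ** 2) + \
           np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2))

mu, var = np.zeros(3), np.ones(3)
print(fid_diag(mu, var, mu, var))        # identical stats -> 0.0
print(fid_diag(mu, var, mu + 1.0, var))  # unit mean shift per dim -> 3.0
```

In practice the statistics come from Inception pool features and the covariances are full, requiring a proper matrix square root.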
16 Signs of bad training? ⚡ easy
Answer: D loss drops to 0 instantly, G loss freezes, outputs become identical, or gradients explode; remedies include tuning learning rates, label smoothing, and TTUR (two time-scale update rule).
17 Spectral normalization? 🔥 hard
Answer: Divide each weight matrix in D by its largest singular value to bound the layer's Lipschitz constant; an alternative to WGAN-GP for a stable critic.
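The largest singular value is estimated cheaply by power iteration, which is how spectral normalization is implemented in practice (NumPy sketch with more iterations than a real layer would use per step):

```python
import numpy as np

def spectral_normalize(W, n_iters=50):
    """Estimate W's top singular value by power iteration and divide it out."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v  # top singular value estimate
    return W / sigma

W = np.array([[3.0, 0.0], [0.0, 1.0]])  # spectral norm 3
W_sn = spectral_normalize(W)
print(np.linalg.norm(W_sn, 2))  # ≈ 1.0 after normalization
```

In training frameworks, a single power-iteration step is run per forward pass, amortizing the estimate across updates.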
18 Data needs? ⚡ easy
Answer: Large diverse sets for photorealism; data augmentation and balancing classes help conditional GANs.
19 GAN vs diffusion today? 📊 medium
Answer: Diffusion often wins on diversity and training stability for images; GANs still valued for fast sampling and some domains.
20 Deepfakes concern? ⚡ easy
Answer: Misinformation and consent—watermarking, detection models, and policy; same tech powers legitimate VFX and data synthesis.

GAN Cheat Sheet

Players
  • G vs D
Issues
  • Mode collapse
  • Balance
Metrics
  • FID

💡 Pro tip: Mention equilibrium, mode collapse, and WGAN/FID.

Full tutorial track

Go deeper with the matching tutorial chapter and code examples.