Computer Vision Interview · Updated 2026

GANs: 20 Essential Q&A

Two-player game: a generator forges realistic samples while a critic learns to spot the fakes; this adversarial setup is the foundation of modern image generation.

~12 min read 20 questions Advanced
generator · discriminator · min-max · mode collapse
1 What is a GAN? ⚡ easy
Answer: A generative model with a generator G(z) that produces samples and a discriminator D(x) that judges real vs. fake; the two networks are trained adversarially.
2 State the min-max game. 🔥 hard
Answer: min_G max_D V(D,G) = E_x[log D(x)] + E_z[log(1 − D(G(z)))]; with the optimal D plugged in, the classic objective reduces to 2·JSD(p_data ‖ p_G) − log 4.
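A quick numeric sketch of the value function (plain NumPy; the function name is illustrative): when D outputs ½ on every input, V sits at its equilibrium value −log 4.

```python
import numpy as np

def value_fn(d_real, d_fake):
    """V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A maximally confused discriminator outputs 1/2 everywhere,
# and the value function equals -log 4 there.
d_half = np.full(8, 0.5)
v_eq = value_fn(d_half, d_half)
print(v_eq)  # ≈ -1.3863 = -log 4
```

Any discriminator that separates real from fake better than chance pushes V above this equilibrium value.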
3 Role of generator? 📊 medium
Answer: Maps noise z (latent) to data space—should match real data distribution at optimum.
# min_G max_D V(D,G) — alternate k steps on D, 1 on G
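The alternation in the comment above can be sketched as a plain loop; the two inner updates here are hypothetical stand-ins (counters) for one SGD step on D's and G's losses.

```python
# Minimal alternating-update schedule for min_G max_D V(D, G):
# k discriminator steps for every generator step.
def train(num_iters, k=5):
    d_steps = g_steps = 0
    for _ in range(num_iters):
        for _ in range(k):      # k steps on D (stand-in for SGD on D's loss)
            d_steps += 1
        g_steps += 1            # ...then 1 step on G
    return d_steps, g_steps

d_steps, g_steps = train(num_iters=100, k=5)
print(d_steps, g_steps)  # 500 100
```

The original GAN paper uses k = 1 in practice; WGAN-style critics typically use k = 5.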
4 Role of discriminator? 📊 medium
Answer: Binary classifier estimating probability “real”—provides training signal to G via gradient through D.
5 Nash equilibrium? 🔥 hard
Answer: At the ideal optimum, p_G = p_data and D(x) = ½ everywhere; this equilibrium is hard to reach in practice with finite capacity and SGD.
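The equilibrium follows from the closed form of the best discriminator for a fixed generator, which is easy to verify numerically (NumPy sketch):

```python
import numpy as np

def optimal_D(p_data, p_g):
    """For a fixed G, the optimal discriminator is D*(x) = p_data / (p_data + p_g)."""
    return p_data / (p_data + p_g)

# When the generator matches the data (p_g = p_data), D* = 1/2 everywhere.
p = np.array([0.2, 0.5, 0.3])
print(optimal_D(p, p))  # [0.5 0.5 0.5]
```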
6 Why unstable training? 📊 medium
Answer: Oscillating dynamics, vanishing gradients when D too good, or D too weak—need balanced updates and architecture tricks.
7 What is mode collapse? 📊 medium
Answer: G collapses to a few output modes and ignores the data's diversity; D fails to push G to cover all modes. Minibatch discrimination and unrolled GANs mitigate it.
8 DCGAN guidelines? 📊 medium
Answer: Strided convolutions, BatchNorm, no FC except input/output, ReLU in G, LeakyReLU in D—empirical recipe for stable conv GANs.
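One concrete consequence of the strided-convolution guideline: each stride-2 transposed conv doubles spatial size. A sketch of the arithmetic with the usual DCGAN choice of kernel 4, stride 2, padding 1:

```python
def conv_transpose_out(size, kernel=4, stride=2, padding=1):
    """Output size of a 2D transposed convolution (per spatial dimension)."""
    return (size - 1) * stride - 2 * padding + kernel

size = 4  # start from a 4x4 map projected from the noise vector
sizes = [size]
for _ in range(4):  # four stride-2 upsampling blocks
    size = conv_transpose_out(size)
    sizes.append(size)
print(sizes)  # [4, 8, 16, 32, 64]
```

Four such blocks take the generator from a 4×4 feature map to a 64×64 image, the resolution used in the DCGAN paper.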
9 WGAN / WGAN-GP? 🔥 hard
Answer: Use the Wasserstein distance with a Lipschitz critic (weight clipping or a gradient penalty); it gives a smoother training signal than JS divergence when the distributions have disjoint support.
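The gradient-penalty term can be checked by hand for a linear critic f(x) = w·x, whose input gradient is w at every point (NumPy sketch; λ = 10 as in the WGAN-GP paper):

```python
import numpy as np

def gradient_penalty(grad_norms, lam=10.0):
    """lam * E[(||grad f(x_hat)|| - 1)^2], the WGAN-GP regularizer."""
    return lam * np.mean((grad_norms - 1.0) ** 2)

# Linear critic f(x) = w . x: its gradient is w everywhere,
# so the penalty depends only on ||w||.
w = np.array([3.0, 4.0])  # ||w|| = 5, far from 1-Lipschitz
gp = gradient_penalty(np.array([np.linalg.norm(w)]))
print(gp)  # 10 * (5 - 1)^2 = 160.0
```

In a real implementation the gradient norm is taken at random interpolates between real and generated samples; here the norm is supplied directly to isolate the penalty itself.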
10 LSGAN / hinge? 📊 medium
Answer: Replace sigmoid BCE with least-squares or hinge losses—often more stable gradients in practice.
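The hinge variant is short enough to write out directly; it operates on raw critic logits rather than sigmoid probabilities (NumPy sketch):

```python
import numpy as np

def hinge_d_loss(d_real, d_fake):
    """Discriminator hinge loss on raw (unbounded) logits."""
    return np.mean(np.maximum(0.0, 1.0 - d_real)) + \
           np.mean(np.maximum(0.0, 1.0 + d_fake))

def hinge_g_loss(d_fake):
    """Generator hinge loss: push D's score on fakes upward."""
    return -np.mean(d_fake)

# A confident, correct D (real logits > 1, fake logits < -1) incurs zero loss,
# so its gradient stops growing once the margin is satisfied.
print(hinge_d_loss(np.array([2.0, 3.0]), np.array([-2.0, -1.5])))  # 0.0
print(hinge_g_loss(np.array([-2.0, -1.5])))  # 1.75
```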
11 Conditional GAN? 📊 medium
Answer: Both G and D conditioned on label, text, or image—enables targeted generation (class-conditional faces, etc.).
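A common way to condition the generator is simply concatenating a one-hot class label onto the noise vector (illustrative sketch; real systems may instead use embeddings or projection discriminators):

```python
import numpy as np

def conditional_input(z, label, num_classes):
    """Concatenate noise with a one-hot class label, a simple cGAN scheme."""
    one_hot = np.zeros(num_classes)
    one_hot[label] = 1.0
    return np.concatenate([z, one_hot])

z = np.random.randn(100)  # latent noise
x = conditional_input(z, label=3, num_classes=10)
print(x.shape)  # (110,): G sees the noise plus the class it must generate
```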
12 pix2pix? 📊 medium
Answer: Paired image-to-image with U-Net G and PatchGAN D—L1 + adversarial loss for aligned translation (maps→aerial).
13 CycleGAN? 🔥 hard
Answer: Translates between unpaired domains using two generators G, F with a cycle-consistency loss L_cycle(G,F) = E[‖F(G(x)) − x‖₁] + E[‖G(F(y)) − y‖₁]; enables horse↔zebra without paired data.
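The cycle-consistency idea fits in a few lines; with toy scalar mappings (stand-ins for the two generators), an F that exactly inverts G drives the loss to zero:

```python
import numpy as np

def cycle_loss(x, G, F):
    """L1 cycle-consistency: F(G(x)) should reconstruct x."""
    return np.mean(np.abs(F(G(x)) - x))

G = lambda x: x + 1.0   # "horse -> zebra" stand-in
F = lambda x: x - 1.0   # "zebra -> horse" stand-in
x = np.array([0.0, 0.5, 1.0])
print(cycle_loss(x, G, F))  # 0.0: F inverts G, so the cycle is consistent
```

The full CycleGAN objective adds the symmetric G(F(y)) term and the adversarial losses for both domains.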
14 StyleGAN idea? 📊 medium
Answer: A mapping network transforms z into an intermediate latent w, which modulates each generator layer through learned affine transforms (AdaIN), controlling style from coarse to fine; basis of high-quality face generation.
15 FID / Inception Score? 📊 medium
Answer: IS rewards samples that an Inception classifier labels confidently while spanning many classes; FID compares the mean and covariance of Inception features between generated and real sets. Lower FID is better.
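FID is the Fréchet distance between two Gaussians fit to the feature statistics. With diagonal covariances the matrix square root simplifies, which makes the formula easy to verify (NumPy sketch; the function name is illustrative):

```python
import numpy as np

def fid_diag(mu1, var1, mu2, var2):
    """Frechet distance between Gaussians with diagonal covariances:
    ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 (C1 C2)^(1/2))."""
    return np.sum((mu1 - mu2) ** 2) + \
           np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2))

mu, var = np.zeros(3), np.ones(3)
print(fid_diag(mu, var, mu, var))        # identical stats -> 0.0
print(fid_diag(mu, var, mu + 1.0, var))  # unit mean shift per dim -> 3.0
```

In practice the statistics come from Inception pool features and the covariances are full, requiring a proper matrix square root.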
16 Signs of bad training? ⚡ easy
Answer: D loss drops to 0 instantly, G loss freezes, outputs become identical, or gradients explode; remedies include tuning learning rates, label smoothing, and TTUR (two time-scale update rule).
17 Spectral normalization? 🔥 hard
Answer: Divide each weight matrix in D by its largest singular value to bound the layer's Lipschitz constant; an alternative to WGAN-GP for a stable critic.
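The largest singular value is estimated cheaply by power iteration, which is how spectral normalization is implemented in practice (NumPy sketch with more iterations than a real layer would use per step):

```python
import numpy as np

def spectral_normalize(W, n_iters=50):
    """Estimate W's top singular value by power iteration and divide it out."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v  # top singular value estimate
    return W / sigma

W = np.array([[3.0, 0.0], [0.0, 1.0]])  # spectral norm 3
W_sn = spectral_normalize(W)
print(np.linalg.norm(W_sn, 2))  # ≈ 1.0 after normalization
```

In training frameworks, a single power-iteration step is run per forward pass, amortizing the estimate across updates.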
18 Data needs? ⚡ easy
Answer: Large diverse sets for photorealism; data augmentation and balancing classes help conditional GANs.
19 GAN vs diffusion today? 📊 medium
Answer: Diffusion often wins on diversity and training stability for images; GANs still valued for fast sampling and some domains.
20 Deepfakes concern? ⚡ easy
Answer: Misinformation and consent—watermarking, detection models, and policy; same tech powers legitimate VFX and data synthesis.

GAN Cheat Sheet

Players
  • G vs D
Issues
  • Mode collapse
  • Balance
Metrics
  • FID

💡 Pro tip: Mention equilibrium, mode collapse, and WGAN/FID.

Full tutorial track

Go deeper with the matching tutorial chapter and code examples.