GANs
20 Essential Q/A
DL Interview Prep
GANs: 20 Interview Questions
Master Generative Adversarial Networks: generator, discriminator, DCGAN, WGAN, CycleGAN, conditional GANs, mode collapse, evaluation metrics. Concise, interview-ready answers with loss formulas.
Generator
Discriminator
Latent Space
Mode Collapse
DCGAN
WGAN
CycleGAN
1
What is a Generative Adversarial Network (GAN)? Explain the core idea.
⚡ Easy
Answer: A GAN consists of two networks: a generator (G) that creates fake data from noise, and a discriminator (D) that tries to distinguish real from fake. They play a minimax game: G tries to fool D, while D tries not to be fooled. At equilibrium, G's samples match the real data distribution and D outputs 0.5 everywhere.
min_G max_D V(D,G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))]
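A minimal sketch of one alternating training step in PyTorch; netG, netD, their optimizers, and the real batch are assumed placeholders, and the generator update uses the non-saturating loss covered in Q3:

import torch
import torch.nn.functional as F

def gan_step(netG, netD, optG, optD, real, z_dim=100):
    device = real.device
    # --- Discriminator: maximize log D(x) + log(1 - D(G(z))) ---
    z = torch.randn(real.size(0), z_dim, device=device)
    fake = netG(z).detach()                      # block gradients into G
    d_real, d_fake = netD(real), netD(fake)      # raw logits
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    optD.zero_grad()
    d_loss.backward()
    optD.step()
    # --- Generator: non-saturating loss, maximize log D(G(z)) ---
    z = torch.randn(real.size(0), z_dim, device=device)
    g_out = netD(netG(z))
    g_loss = F.binary_cross_entropy_with_logits(g_out, torch.ones_like(g_out))
    optG.zero_grad()
    g_loss.backward()
    optG.step()
    return d_loss.item(), g_loss.item()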
2
Describe the roles of generator and discriminator in detail.
📊 Medium
Answer: Generator maps latent vector z (random noise) to data space, trying to produce realistic samples. Discriminator is a binary classifier that outputs probability of input being real. They are trained alternately: D on real + fake, G to maximize D's error.
3
What is the difference between minimax loss and non-saturating loss in GANs?
🔥 Hard
Answer: Minimax: G minimizes log(1 - D(G(z))), which saturates early in training when D easily rejects fakes → vanishing gradients. Non-saturating: G maximizes log(D(G(z))) → strong gradients even early on. Modern GANs default to the non-saturating loss.
L_G = -E_z[log(D(G(z)))] (non-saturating)
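A toy check (illustrative, not from the source) of why this matters: when D confidently rejects fakes (very negative logits), the minimax loss yields near-zero gradients while the non-saturating loss still yields gradients near -1:

import torch
import torch.nn.functional as F

s = torch.linspace(-6.0, 6.0, 5, requires_grad=True)   # fake-sample logits from D
minimax = torch.log(1 - torch.sigmoid(s))               # G minimizes log(1 - D(G(z)))
non_sat = -F.logsigmoid(s)                              # G minimizes -log D(G(z))
g_mm, = torch.autograd.grad(minimax.sum(), s, retain_graph=True)
g_ns, = torch.autograd.grad(non_sat.sum(), s)
print(g_mm)  # ~0 at the negative end: the gradient saturates
print(g_ns)  # ~-1 at the negative end: a strong learning signal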
4
What is mode collapse in GANs? Why does it happen?
🔥 Hard
Answer: Mode collapse occurs when the generator produces only a limited variety of outputs, covering just a few modes of the data distribution. It happens when G finds a few "tricks" that reliably fool D and over-optimizes them instead of exploring the full distribution; a collapsed G emits near-identical, repetitive samples.
Solutions: WGAN, minibatch discrimination, unrolled GANs (see the minibatch-statistics sketch below).
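A sketch of a minibatch standard-deviation feature, a simple relative of minibatch discrimination popularized by ProGAN (names are illustrative): appending the batch-wide standard deviation as an extra channel lets D detect low-diversity, collapsed batches.

import torch

def minibatch_stddev(x):                         # x: (N, C, H, W) features in D
    std = x.std(dim=0).mean()                    # one scalar: batch diversity
    feat = std.expand(x.size(0), 1, x.size(2), x.size(3))
    return torch.cat([x, feat], dim=1)           # extra channel D can inspect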
5
What are the main contributions of DCGAN?
📊 Medium
Answer: DCGAN (Deep Convolutional GAN) introduced architectural guidelines: 1) Replace pooling with strided convolutions in D and fractional-strided (transposed) convolutions in G. 2) BatchNorm in both G and D. 3) No fully connected hidden layers. 4) ReLU in G (tanh at the output), LeakyReLU in D. These guidelines stabilized training.
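A minimal generator sketch following these guidelines (layer sizes are illustrative, targeting 32x32 RGB output):

import torch.nn as nn

class DCGANGenerator(nn.Module):
    def __init__(self, z_dim=100, ngf=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, ngf * 4, 4, 1, 0, bias=False),   # 1x1 -> 4x4
            nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False), # -> 8x8
            nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),     # -> 16x16
            nn.BatchNorm2d(ngf), nn.ReLU(True),
            nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),           # -> 32x32
            nn.Tanh(),                                                 # output in [-1, 1]
        )

    def forward(self, z):                        # z: (N, z_dim)
        return self.net(z.view(z.size(0), -1, 1, 1))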
6
How does WGAN improve GAN training?
🔥 Hard
Answer: WGAN replaces the JSD-based loss with the Earth-Mover (Wasserstein) distance, which is continuous and provides meaningful gradients even when the critic is well-trained. Uses weight clipping (later replaced by gradient penalty in WGAN-GP) to enforce the Lipschitz constraint. Greatly reduces mode collapse and training instability.
V(G,D) = E_x[D(x)] - E_z[D(G(z))]; Lipschitz constraint via gradient penalty
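A critic-update sketch with weight clipping (critic, netG, and the optimizer are assumed placeholders; the 0.01 clip value follows the original paper):

import torch

def critic_step(critic, netG, optC, real, z_dim=100, clip=0.01):
    z = torch.randn(real.size(0), z_dim, device=real.device)
    # maximize E[D(x)] - E[D(G(z))]  <=>  minimize the negative
    loss = -(critic(real).mean() - critic(netG(z).detach()).mean())
    optC.zero_grad()
    loss.backward()
    optC.step()
    for p in critic.parameters():
        p.data.clamp_(-clip, clip)               # crude Lipschitz enforcement
    return loss.item()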
7
What is a conditional GAN? Where is it used?
📊 Medium
Answer: cGAN feeds additional condition (class label, text, image) to both generator and discriminator. Enables controlled generation. Applications: Pix2Pix, text-to-image synthesis, semantic segmentation.
min_G max_D V(D,G) = E_x[log D(x|y)] + E_z[log(1 - D(G(z|y)))]
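A conditioning sketch: embed the class label and concatenate it with z before the first layer (sizes and names are illustrative, not from a specific paper):

import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, z_dim=100, n_classes=10, emb_dim=50, out_dim=784):
        super().__init__()
        self.embed = nn.Embedding(n_classes, emb_dim)
        self.net = nn.Sequential(
            nn.Linear(z_dim + emb_dim, 256), nn.ReLU(True),
            nn.Linear(256, out_dim), nn.Tanh(),
        )

    def forward(self, z, y):                     # y: integer class labels
        return self.net(torch.cat([z, self.embed(y)], dim=1))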
8
How does CycleGAN perform unpaired image translation?
🔥 Hard
Answer: CycleGAN uses two generators (G: X→Y, F: Y→X) and two discriminators. Key: cycle-consistency loss – translating X→Y→X should return original. No paired data needed. Also identity loss to preserve color.
L_cyc = E_x[||F(G(x)) - x||_1] + E_y[||G(F(y)) - y||_1]
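A sketch of the cycle-consistency term (G and F_ are assumed generator placeholders; F_ avoids shadowing torch.nn.functional):

import torch.nn.functional as F

def cycle_loss(G, F_, x, y, lam=10.0):           # G: X->Y, F_: Y->X
    loss_x = F.l1_loss(F_(G(x)), x)              # X -> Y -> X should recover x
    loss_y = F.l1_loss(G(F_(y)), y)              # Y -> X -> Y should recover y
    return lam * (loss_x + loss_y)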
9
What is latent space in GANs? Why is interpolation smooth?
📊 Medium
Answer: Latent space (z) is low-dimensional input to generator, typically Gaussian. G learns to map continuous z to realistic images; interpolating between z vectors yields semantically smooth transitions, showing G has learned meaningful representations.
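An interpolation sketch, assuming netG is a trained generator and z0, z1 are (z_dim,) latent vectors:

import torch

def interpolate(netG, z0, z1, steps=8):
    alphas = torch.linspace(0, 1, steps, device=z0.device).view(-1, 1)
    zs = (1 - alphas) * z0 + alphas * z1         # (steps, z_dim) linear walk
    with torch.no_grad():
        return netG(zs)                          # semantically smooth frames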
10
What is unique about StyleGAN architecture?
🔥 Hard
Answer: StyleGAN feeds the synthesis network a learned constant instead of the latent vector directly; a mapping network transforms z into an intermediate latent w, which controls style at each layer via AdaIN (adaptive instance normalization). Per-pixel noise adds stochastic variation. Enables disentangled coarse-to-fine style control.
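An AdaIN sketch: normalize each channel's statistics, then re-scale and shift with style-derived parameters (style_scale and style_bias would come from w through a learned affine layer; names are illustrative):

def adain(x, style_scale, style_bias, eps=1e-5): # x: (N, C, H, W)
    mu = x.mean(dim=(2, 3), keepdim=True)        # per-channel mean
    sigma = x.std(dim=(2, 3), keepdim=True)      # per-channel std
    normed = (x - mu) / (sigma + eps)
    return style_scale * normed + style_bias     # style controls this layer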
11
How are GANs evaluated? Explain FID and Inception Score.
🔥 Hard
Answer: Inception Score (IS): uses a pretrained Inception network; measures image quality and diversity (high score if class predictions are confident and label distribution is varied). Fréchet Inception Distance (FID): Wasserstein-2 (Fréchet) distance between Gaussian fits of real and fake Inception features; lower is better, and it is more robust than IS.
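An FID sketch over precomputed Inception features (assumes feats_real and feats_fake are (N, 2048) NumPy arrays extracted with a pretrained Inception network):

import numpy as np
from scipy import linalg

def fid(feats_real, feats_fake):
    mu1, mu2 = feats_real.mean(0), feats_fake.mean(0)
    s1 = np.cov(feats_real, rowvar=False)
    s2 = np.cov(feats_fake, rowvar=False)
    covmean = linalg.sqrtm(s1 @ s2)              # matrix square root
    if np.iscomplexobj(covmean):
        covmean = covmean.real                   # drop tiny imaginary parts
    diff = mu1 - mu2
    return diff @ diff + np.trace(s1 + s2 - 2 * covmean)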
12
Why do GANs suffer from vanishing gradients?
📊 Medium
Answer: When D becomes too strong (perfectly classifying fakes), log(1 - D(G(z))) saturates and its gradient with respect to G vanishes, giving G almost no learning signal. Solutions: non-saturating loss, WGAN (critic scores instead of probabilities), label smoothing, or weakening D.
13
What is one-sided label smoothing? Why only for real labels?
📊 Medium
Answer: Replace real labels (1) with soft values like 0.9. Prevents D from becoming overconfident, providing smoother gradients. Only smooth real labels; smoothing fake labels (0→0.1) encourages D to push G samples away, harming training.
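A sketch of the smoothed discriminator loss (the logit tensors are assumed placeholders):

import torch
import torch.nn.functional as F

def d_loss_smoothed(d_real_logits, d_fake_logits, smooth=0.9):
    real_t = torch.full_like(d_real_logits, smooth)  # real labels 1 -> 0.9
    fake_t = torch.zeros_like(d_fake_logits)         # fake labels stay 0
    return (F.binary_cross_entropy_with_logits(d_real_logits, real_t)
            + F.binary_cross_entropy_with_logits(d_fake_logits, fake_t))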
14
Compare GANs and VAEs.
📊 Medium
GANs: Adversarial training, sharp realistic images, no explicit likelihood, prone to mode collapse, harder to train.
VAEs: Variational lower bound, maximizes likelihood, covers all modes (but blurry), stable training, latent space structured.
15
Why is weight clipping problematic in WGAN? How is it fixed?
🔥 Hard
Answer: Weight clipping forces critic to lie in narrow space, leading to capacity underuse and exploding/vanishing gradients. WGAN-GP replaces it with gradient penalty: penalize if gradient norm deviates from 1 (Lipschitz constraint).
gp = lambda_gp * ((grad_norm - 1) ** 2).mean()
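A fuller gradient-penalty sketch (critic is an assumed placeholder; lambda_gp = 10 follows the WGAN-GP paper, and the alpha shape assumes 4-D image batches):

import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    scores = critic(interp)
    grads, = torch.autograd.grad(scores.sum(), interp, create_graph=True)
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1) ** 2).mean()  # penalize norm != 1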
16
What is spectral normalization in GANs?
🔥 Hard
Answer: Normalizes weights by their largest singular value, enforcing Lipschitz constraint (spectral norm = 1). Used in SNGAN; stabilizes training without heavy hyperparameter tuning. Works well for both G and D.
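A usage sketch with PyTorch's built-in wrapper (layer sizes are illustrative, assuming 32x32 inputs):

import torch.nn as nn
from torch.nn.utils import spectral_norm

disc = nn.Sequential(
    spectral_norm(nn.Conv2d(3, 64, 4, 2, 1)), nn.LeakyReLU(0.2),    # 32 -> 16
    spectral_norm(nn.Conv2d(64, 128, 4, 2, 1)), nn.LeakyReLU(0.2),  # 16 -> 8
    nn.Flatten(),
    spectral_norm(nn.Linear(128 * 8 * 8, 1)),                       # critic score
)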
17
Why introduce attention in GANs?
📊 Medium
Answer: SAGAN uses self-attention to model long-range dependencies (global features) instead of only local convolutions. Improves image quality in complex scenes (e.g., ImageNet) by capturing relationships between distant regions.
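A self-attention block sketch in the spirit of SAGAN (1x1 convolutions produce query/key/value maps; gamma starts at zero so the block begins as an identity):

import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.q = nn.Conv2d(ch, ch // 8, 1)
        self.k = nn.Conv2d(ch, ch // 8, 1)
        self.v = nn.Conv2d(ch, ch, 1)
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):                        # x: (N, C, H, W)
        n, c, h, w = x.shape
        q = self.q(x).view(n, -1, h * w)         # (N, C//8, HW)
        k = self.k(x).view(n, -1, h * w)
        v = self.v(x).view(n, -1, h * w)         # (N, C, HW)
        attn = F.softmax(q.transpose(1, 2) @ k, dim=-1)   # (N, HW, HW)
        out = (v @ attn.transpose(1, 2)).view(n, c, h, w)
        return self.gamma * out + x              # residual connection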
18
Explain feature matching technique in GANs.
🔥 Hard
Answer: G is trained to match the expected features (intermediate activations of D) of real data, not just to maximize D's final output. Minimizes the L2 distance between real and fake feature means. Helps prevent G from overfitting to the current D.
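A sketch of the loss (d_features is an assumed helper returning an intermediate activation of D):

import torch.nn.functional as F

def feature_matching_loss(d_features, real, fake):
    f_real = d_features(real).mean(dim=0)        # expected real features
    f_fake = d_features(fake).mean(dim=0)        # expected fake features
    return F.mse_loss(f_fake, f_real.detach())   # match feature means, not labels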
19
What is progressive growing in GANs?
🔥 Hard
Answer: Start training with low-resolution images, gradually add layers to increase resolution. Stabilizes high-resolution GAN training (e.g., 1024x1024). Both G and D grow simultaneously. Used in StyleGAN, ProGAN.
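A fade-in sketch for the growth transition (old_rgb and new_rgb are the outputs of the previous and newly added resolution blocks; alpha ramps from 0 to 1):

import torch.nn.functional as F

def fade_in(old_rgb, new_rgb, alpha):
    old_up = F.interpolate(old_rgb, scale_factor=2, mode='nearest')  # match sizes
    return alpha * new_rgb + (1 - alpha) * old_up                    # smooth blend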
20
What is Nash equilibrium in context of GANs? Do we achieve it?
🔥 Hard
Answer: At the Nash equilibrium, neither network can improve by changing only its own parameters: G matches the real data distribution and D outputs 0.5 everywhere (cannot distinguish real from fake). In practice, GAN training oscillates and rarely converges to an exact equilibrium; we aim for an approximate one. Techniques like consensus optimization try to find stable points.
GANs – Interview Cheat Sheet
Generator
- ✓ Creates realistic fake data
- ✗ Mode collapse risk
- z: latent input
Discriminator
- Binary classifier: real vs. fake
- Provides gradient signal to G
Advanced GANs
- DCGAN: convolutional architecture guidelines
- WGAN: Wasserstein distance, stable training
- CycleGAN: unpaired image translation
- StyleGAN: style-based control
Metrics
- FID: Fréchet Inception Distance (lower is better)
- IS: Inception Score
Verdict: "GANs: an adversarial game between generator and discriminator. Latent z maps to photorealistic samples; watch for mode collapse, and stabilize with WGAN or spectral norm."