Generative AI 20 Essential Q/A
FAANG GenAI Prep

Generative AI: 20 Interview Questions

Master LLMs, GANs, VAEs, Diffusion Models, GPT, prompt engineering, RLHF, RAG, fine-tuning (LoRA), and evaluation metrics. Interview-ready answers with formulas.

Topics: LLM, Diffusion, GPT, LoRA, Prompt Engineering, RAG
1 What is Generative AI? How does it differ from discriminative models? ⚡ Easy
Answer: Generative AI models learn the joint probability distribution P(X, Y) or P(X) to generate new samples from the same distribution. Discriminative models learn decision boundary P(Y|X). Generative models create; discriminative models classify.
Generative: learn P(X). Discriminative: learn P(Y|X).
2 Major families of generative models? 📊 Medium
Answer: GANs (adversarial), VAEs (variational), Flow-based (invertible), Diffusion (denoising), Autoregressive (GPT), Energy-based, Hybrids. Each with different trade-offs in sample quality, likelihood, speed.
3 Explain GAN. Generator and discriminator loss? 🔥 Hard
Answer: GAN: Generator (G) creates fake samples; Discriminator (D) distinguishes real vs fake. Min-max game: D tries to maximize log(D(x)) + log(1-D(G(z))); G minimizes log(1-D(G(z))) (or -log(D(G(z))) for better gradients).
min_G max_D V(D,G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))]
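A minimal numeric sketch of the objective above (pure Python, illustrative only; `d_real` and `d_fake` stand for scalar discriminator outputs D(x) and D(G(z))):

```python
import math

def d_loss(d_real, d_fake):
    """Discriminator maximizes log D(x) + log(1 - D(G(z)));
    written as a loss to minimize, we negate it."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def g_loss_saturating(d_fake):
    """Original generator objective: minimize log(1 - D(G(z))).
    Gradient vanishes when D confidently rejects fakes."""
    return math.log(1.0 - d_fake)

def g_loss_nonsaturating(d_fake):
    """Non-saturating variant: minimize -log D(G(z)), which gives
    stronger gradients early in training."""
    return -math.log(d_fake)
```

Note how `g_loss_nonsaturating` grows without bound as D(G(z)) → 0, whereas `g_loss_saturating` flattens out; that is exactly the "better gradients" point in the answer.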
4 How does VAE work? ELBO explained. 🔥 Hard
Answer: VAE is a latent-variable model: the encoder approximates the posterior q(z|x) (outputs μ, σ); the decoder models p(x|z). ELBO = E_q[log p(x|z)] - KL(q(z|x)||p(z)), a lower bound on log p(x); training maximizes it (reconstruction term plus regularization toward the prior). Reparameterization trick (z = μ + σ·ε, ε ~ N(0, I)) enables backprop through sampling.
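The KL term and the reparameterization trick both have simple closed forms; a sketch for a diagonal Gaussian posterior and standard-normal prior (pure Python, illustrative):

```python
import math

def kl_diag_gaussian(mu, sigma):
    """Closed-form KL(q(z|x) || p(z)) for diagonal Gaussian q and
    standard-normal prior: 0.5 * sum(mu^2 + sigma^2 - 1 - log sigma^2)."""
    return 0.5 * sum(m * m + s * s - 1.0 - math.log(s * s)
                     for m, s in zip(mu, sigma))

def reparameterize(mu, sigma, eps):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
    so gradients flow through mu and sigma rather than a sampler."""
    return [m + s * e for m, s, e in zip(mu, sigma, eps)]
```

Sanity check: when q equals the prior (μ = 0, σ = 1) the KL term is exactly zero.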
5 How do denoising diffusion models work? 🔥 Hard
Answer: Forward: gradually add Gaussian noise to data over T steps (q(x_t|x_{t-1})). Reverse: learn to denoise (p_θ(x_{t-1}|x_t)). Trained with variational bound; simplified loss: predict added noise ε. Stable Diffusion adds text conditioning via cross-attention.
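The forward process admits a closed form, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1-ᾱ_t)·ε with ᾱ_t = ∏_{s≤t}(1-β_s), which is what makes training efficient (any timestep can be sampled directly). A scalar sketch (pure Python, illustrative):

```python
import math

def alpha_bar(betas, t):
    """Cumulative product alpha_bar_t = prod_{s<=t} (1 - beta_s)."""
    prod = 1.0
    for s in range(t):
        prod *= 1.0 - betas[s]
    return prod

def q_sample(x0, betas, t, eps):
    """Closed-form forward process: jump straight from x_0 to x_t
    without iterating, given noise eps ~ N(0, 1)."""
    ab = alpha_bar(betas, t)
    return math.sqrt(ab) * x0 + math.sqrt(1.0 - ab) * eps
```

The network is then trained to predict `eps` from `(x_t, t)`; the reverse chain reuses that prediction to step back toward x_0.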
6 What are autoregressive generative models? 📊 Medium
Answer: Generate a sequence token by token, conditioning on previous tokens: P(x) = ∏ P(x_t | x_<t). Examples: GPT, PixelCNN, WaveNet. Training parallelizes via teacher forcing and is stable, but inference is sequential (one token at a time).
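The chain-rule factorization above in one line (pure Python; `cond_probs[t]` is assumed to be the model's probability of the observed token x_t given its prefix):

```python
import math

def sequence_log_prob(cond_probs):
    """Chain rule: log P(x) = sum_t log P(x_t | x_<t).
    Working in log space avoids underflow on long sequences."""
    return sum(math.log(p) for p in cond_probs)
```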
7 What is prompt engineering? Common techniques? 📊 Medium
Answer: Designing input prompts to steer LLM output. Techniques: zero-shot, few-shot, chain-of-thought (CoT), role prompting, system prompts, instruction tuning, delimiter usage. Critical for production LLM applications.
8 Full fine-tuning vs parameter-efficient methods (PEFT)? 🔥 Hard
Answer: Full fine-tuning updates all weights (costly; risks catastrophic forgetting). PEFT: adapters, prefix tuning, LoRA (low-rank matrices). LoRA: W' = W + BA; freeze W, train B and A. Saves memory and enables multi-task serving (swap small adapters per task).
9 Explain LoRA mathematically. Why no inference latency? 🔥 Hard
Answer: LoRA: ΔW = BA (B∈R^{d×r}, A∈R^{r×k}), r << d. Update: h = Wx + BAx. Merge after training: W_merged = W + BA. No extra compute at inference. Reduces trainable params by ~10,000x.
10 How does RLHF work? (ChatGPT) 🔥 Hard
Answer: 1) Supervised fine-tuning (SFT). 2) Train reward model from human preference comparisons. 3) Optimize policy with PPO using reward model. Aligns models with human preferences, reduces harm, improves helpfulness.
11 What is RAG? When to use? 📊 Medium
Answer: RAG = retrieve relevant documents from external corpus, add to context, then generate. Reduces hallucination, handles knowledge updates without retraining, improves factual accuracy. Used in chatbots, QA systems.
12 Why do LLMs hallucinate? How to reduce? 🔥 Hard
Answer: Causes: model trained to be helpful (may guess), lack of knowledge, memorized misconceptions, sampling randomness. Mitigation: RAG, prompt engineering (cite sources), temperature reduction, reinforcement learning from feedback, controlled decoding.
13 Temperature scaling vs top-k vs nucleus (top-p) sampling? 📊 Medium
Answer: Temperature T: softmax(logits/T). T→0 greedy, T=1 standard, T>1 more random. Top-k: sample from k highest probability tokens. Top-p (nucleus): sample from smallest set with cumulative probability ≥ p. Often combined.
14 Metrics for evaluating generative models? 🔥 Hard
Answer: Image: Inception Score (IS), FID (Fréchet Inception Distance), Precision/Recall. Text: perplexity, BLEU, ROUGE, METEOR, BERTScore, human evaluation. LLM-as-a-judge (GPT-4). No single metric captures all.
15 What is mode collapse? Solutions? 🔥 Hard
Answer: Generator produces limited varieties (collapses to few modes). Solutions: WGAN (Wasserstein loss), gradient penalty, mini-batch discrimination, unrolled GANs, diverse batch sampling, regularization.
16 Architecture of Stable Diffusion? 🔥 Hard
Answer: 1) VAE encoder/decoder (compress to latent space). 2) U-Net with cross-attention (denoise). 3) Text encoder (CLIP/OpenCLIP). Diffusion in latent space (efficient). Conditioned on text embeddings. Classifier-free guidance.
17 What is classifier-free guidance (CFG)? 🔥 Hard
Answer: Diffusion model trained with and without condition (randomly drop). Sampling: ϵ = ϵ_cond + w(ϵ_cond - ϵ_uncond). w > 1 increases condition adherence (at cost of diversity). Common in DALL·E 2, Stable Diffusion.
18 What is VQ-VAE? Why used? 🔥 Hard
Answer: VQ-VAE uses discrete latent codes (learned codebook). Encoder output mapped to nearest embedding, decoder reconstructs. Avoids posterior collapse, good for high-fidelity generation. Used in VQ-GAN, DALL·E.
19 Chain-of-Thought prompting – how does it help? 📊 Medium
Answer: CoT encourages step-by-step reasoning before final answer. Improves performance on arithmetic, commonsense, symbolic reasoning tasks. Zero-shot CoT: "Let's think step by step". Emergent ability of large models.
20 What is Constitutional AI (Anthropic)? 🔥 Hard
Answer: Train models to critique and revise own outputs based on constitution principles. RLHF from AI feedback (RLAIF). Reduces need for human labeling, improves harmlessness, scalable oversight.

Generative AI – Interview Cheat Sheet

Model Families
  • GAN Adversarial, sharp images, mode collapse
  • Diffusion SOTA images, slow inference
  • VAE Stable, likelihood, blurry
  • Autoregressive GPT, sequential
LLM Techniques
  • Fine-tuning: Full vs LoRA/adapters
  • RLHF: Reward model + PPO
  • RAG: Retrieve + generate
  • Prompt: CoT, few-shot, system
Evaluation
  • Image FID, IS, Precision/Recall
  • Text Perplexity, BLEU, BERTScore
  • LLM GPT-4 as judge
Challenges
  • Hallucination (RAG, prompt)
  • Mode collapse (WGAN, diversity)
  • Posterior collapse (β-VAE, VQ)
  • Inference speed (distillation)

Verdict: "GANs and VAEs are foundations; Diffusion and LLMs dominate today. Know your fine-tuning and prompting."