Neural Networks: 15 Essential Q&A
Interview Prep

Network Design & Depth — 15 Interview Questions

How deep vs how wide, bottlenecks, skip connections, receptive fields, and matching architecture to data—without hand-wavy “bigger is better.”


Topics: Depth · Width · Bottleneck · Inductive bias
1. What does “network design” mean in an interview? (Easy)
Answer: Choosing depth, width, connectivity patterns (residual, dense), input/output heads, and regularization hooks so the model has enough capacity but fits data and compute.
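A minimal sketch of those knobs gathered into one config object (the names here are illustrative, not a standard API):

```python
from dataclasses import dataclass

# Hypothetical design spec: the knobs "network design" usually covers.
@dataclass
class NetworkDesign:
    depth: int = 12                  # number of layers/blocks
    width: int = 256                 # units or channels per layer
    connectivity: str = "residual"   # plain | residual | dense
    head: str = "classification"     # task-specific output head
    dropout: float = 0.1             # regularization hook
```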
2. Depth vs width—trade-offs. (Medium)
Answer: Depth composes features hierarchically; can improve sample efficiency for structured tasks. Width increases representational power per layer. Very deep nets need care (residuals, normalization); very wide nets cost parameters and memory.
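A quick parameter-count comparison for intuition (layer sizes are made up; pure Python, no framework needed):

```python
# Parameter counts for two MLPs: deep-and-narrow vs shallow-and-wide.
def mlp_params(sizes):
    # Each Linear layer: in*out weights + out biases.
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))

deep_narrow = [784] + [128] * 8 + [10]   # eight 128-unit hidden layers
shallow_wide = [784, 1024, 10]           # one wide hidden layer

print(mlp_params(deep_narrow))   # 217354  (~0.2M)
print(mlp_params(shallow_wide))  # 815114  (~0.8M)
```

The wide net spends far more parameters on a single layer; the deep net reuses a small width but composes many more transformations.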
3. What is model capacity? (Easy)
Answer: Roughly the family of functions the architecture can represent (VC-style intuition or parameter count as proxy). High capacity can overfit small data; too low underfits.
4. What is a bottleneck layer? (Medium)
Answer: A layer with fewer units than neighbors, forcing compression of the representation—used in autoencoders, Inception modules, some efficient conv blocks (1×1 convs).
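A minimal PyTorch sketch of a ResNet-style bottleneck (PyTorch assumed; channel sizes illustrative):

```python
import torch
from torch import nn

# 1x1 conv compresses channels, 3x3 works in the compressed space,
# 1x1 expands back; cheaper than one wide 3x3 conv at the same width.
class Bottleneck(nn.Module):
    def __init__(self, channels=256, squeeze=64):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, squeeze, kernel_size=1), nn.ReLU(),
            nn.Conv2d(squeeze, squeeze, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(squeeze, channels, kernel_size=1),
        )

    def forward(self, x):
        return self.block(x)

x = torch.randn(1, 256, 32, 32)
print(Bottleneck()(x).shape)  # torch.Size([1, 256, 32, 32])
```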
5. Why do skip (residual) connections help very deep nets? (Medium)
Answer: They provide gradient highways and make it easier to learn near-identity refinements (“residual mapping”). Mitigates degradation and vanishing signal in deep stacks.
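A minimal residual block sketch (PyTorch assumed):

```python
import torch
from torch import nn

# The conv path learns a correction F(x); the skip carries x forward
# unchanged, so gradients flow through the addition even when F's
# gradients are small, and the identity map is trivially representable.
class ResidualBlock(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(x + self.f(x))  # output = x + F(x)
```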
6. What is inductive bias? (Medium)
Answer: Prior assumptions baked into the architecture—e.g. CNNs assume locality and translation equivariance; RNNs assume sequential dependence. Good bias improves data efficiency.
7. Receptive field—why does it matter for CNN design? (Medium)
Answer: The region of input affecting one output neuron. Must grow large enough to capture context (objects, text n-grams in 1D CNNs)—deeper stacks, dilated convs, or pooling increase effective RF.
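The growth is easy to compute with the standard recurrence; a small sketch for hypothetical conv stacks:

```python
# Receptive field of stacked convs: RF grows by (kernel-1)*jump per
# layer; stride multiplies the jump (output-to-input step size).
def receptive_field(layers):  # layers = [(kernel, stride), ...]
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7, like one 7x7 conv
print(receptive_field([(3, 2), (3, 1), (3, 1)]))  # 11, stride grows RF faster
```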
8. Parameters vs FLOPs—both needed? (Easy)
Answer: Parameters drive memory and overfitting risk; FLOPs drive latency and training cost. A layer can be compute-heavy but parameter-light (depthwise separable convs) or the opposite.
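A back-of-envelope comparison with illustrative sizes (counts multiply-accumulates only, ignoring biases and activations):

```python
# Standard 3x3 conv vs depthwise-separable (3x3 depthwise + 1x1 pointwise).
def conv_cost(c_in, c_out, k, h, w):
    params = c_in * c_out * k * k
    return params, params * h * w        # MACs = params per output pixel

def separable_cost(c_in, c_out, k, h, w):
    dw = c_in * k * k                    # depthwise: one filter per channel
    pw = c_in * c_out                    # pointwise 1x1 mixes channels
    return dw + pw, (dw + pw) * h * w

print(conv_cost(128, 128, 3, 56, 56))       # (147456, 462422016)
print(separable_cost(128, 128, 3, 56, 56))  # (17536, 54992896)
```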
9. Signs your network is too small (underfitting). (Easy)
Answer: Training loss stays high; both train and validation error are poor. Fix: more layers/units, better features, or longer training if optimization was the issue. (Diagnostic sketch after Q10.)
10. Signs your network is too large (overfitting). (Easy)
Answer: Training loss is low but validation is much worse. Fix: regularization, more data, a smaller model, or early stopping—the fix is rarely “more parameters.”
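Both signals reduce to reading the train/val gap; a hypothetical diagnostic sketch (the `tol` threshold is made up):

```python
# Rough rule of thumb from the loss curves; not a substitute for plots.
def diagnose(train_loss, val_loss, tol=0.05):
    if train_loss > tol and abs(val_loss - train_loss) < tol:
        return "underfit: raise capacity or train longer"
    if val_loss - train_loss > tol:
        return "overfit: regularize, add data, or shrink the model"
    return "reasonable fit"

print(diagnose(train_loss=0.90, val_loss=0.92))  # underfit
print(diagnose(train_loss=0.05, val_loss=0.60))  # overfit
```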
11. “Scaling laws” in one interview sentence. (Hard)
Answer: Empirically, loss often improves predictably as a power law in parameters, data, and compute along compute-optimal frontiers—guides large-model training but doesn’t replace task-specific design.
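For a follow-up, the power-law form is worth sketching (after Kaplan et al., 2020; N_c, D_c and the exponents are empirically fitted constants):

```latex
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N},
\qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}
```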
12. Multi-branch architectures (e.g. the Inception idea). (Hard)
Answer: Parallel paths with different kernel sizes or operations capture multi-scale features; concatenation or addition fuses them—richer than a single tower, at the cost of complexity.
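A simplified multi-branch sketch (PyTorch assumed; real Inception modules add 1×1 reductions and a pooling branch):

```python
import torch
from torch import nn

# Parallel kernel sizes see different scales; concat fuses on channels.
class MultiBranch(nn.Module):
    def __init__(self, c_in=64, c_branch=32):
        super().__init__()
        self.b1 = nn.Conv2d(c_in, c_branch, 1)
        self.b3 = nn.Conv2d(c_in, c_branch, 3, padding=1)
        self.b5 = nn.Conv2d(c_in, c_branch, 5, padding=2)

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)

x = torch.randn(1, 64, 28, 28)
print(MultiBranch()(x).shape)  # torch.Size([1, 96, 28, 28])
```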
13. How does input resolution affect design? (Medium)
Answer: Higher resolution increases spatial tokens and compute: attention cost grows quadratically in token count, while conv compute grows roughly linearly in pixel count. May need deeper nets (for receptive field) or early downsampling to control cost.
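The token-count arithmetic for a ViT-style model (patch size and resolutions are illustrative):

```python
# Tokens grow quadratically with side length; attention pairs
# grow quadratically again in token count.
patch = 16
for side in (224, 448):
    tokens = (side // patch) ** 2
    print(side, tokens, tokens ** 2)
# 224 -> 196 tokens; 448 -> 784 tokens (4x), so ~16x attention pairs.
```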
14. When should you start from a pretrained model? (Medium)
Answer: With small data or a domain similar to the pretraining corpus—reuse the backbone, replace the classifier head. Random init can win when data is huge or domain mismatch is extreme (with caveats).
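A common transfer-learning sketch, assuming a recent torchvision is available:

```python
import torch
from torch import nn
from torchvision import models

# Reuse a pretrained ResNet-18 backbone; swap in a new classifier head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                     # freeze the backbone
model.fc = nn.Linear(model.fc.in_features, 10)  # new 10-class head (trainable)

x = torch.randn(1, 3, 224, 224)
print(model(x).shape)  # torch.Size([1, 10])
```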
15. Practical order for picking depth and width. (Medium)
Answer: Start from a known baseline (ResNet-18, small Transformer), match the parameter budget to your GPU and dataset size, measure train vs val curves, then adjust depth/width/regularization—don’t guess huge first.
Tip: mention the train/val gap and compute budget—it signals you design empirically, not only from theory.

Quick review checklist

  • Depth vs width; capacity; bottlenecks; residual paths.
  • Inductive bias; receptive field; params vs FLOPs.
  • Underfit vs overfit signals; scaling and pretrained backbones.