Recurrent Neural Networks Deep Dive · 15 questions · 25 min

RNN & LSTM MCQ · test your sequence modeling knowledge

From vanilla RNN to LSTM gates, GRU, and backpropagation through time – 15 questions covering recurrent architectures.

Difficulty: Easy 5 · Medium 6 · Hard 4
Topics: RNN · LSTM · Gates · BPTT

Recurrent Neural Networks & LSTM: modeling sequences

Recurrent Neural Networks (RNNs) process sequences by maintaining a hidden state. Long Short‑Term Memory (LSTM) and Gated Recurrent Units (GRU) address the vanishing gradient problem via gating mechanisms. This MCQ test covers architecture, gates, BPTT, and practical variants.

Why recurrence?

Feedforward networks require fixed-size inputs and treat each input independently, so they handle variable-length sequences poorly. RNNs share the same parameters across all time steps, which lets them process sequences of any length and model temporal dependencies.

RNN/LSTM glossary – key concepts

Vanilla RNN

h_t = tanh(W·[h_{t-1}, x_t] + b). Suffers from vanishing/exploding gradients over long sequences.
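The recurrence above can be sketched in a few lines of NumPy. Sizes, weights, and the input sequence below are toy values chosen for illustration; note that the same `W` and `b` are reused at every time step:

```python
import numpy as np

rng = np.random.default_rng(0)
H, D = 4, 3                                 # hidden size, input size (toy values)
W = rng.normal(scale=0.1, size=(H, H + D))  # shared across all time steps
b = np.zeros(H)

def rnn_forward(xs, h0):
    """Vanilla RNN over a sequence: h_t = tanh(W @ [h_{t-1}, x_t] + b)."""
    h = h0
    for x in xs:
        h = np.tanh(W @ np.concatenate([h, x]) + b)
    return h

seq = [rng.normal(size=D) for _ in range(5)]  # works for any sequence length
h_final = rnn_forward(seq, np.zeros(H))
print(h_final.shape)  # (4,)
```

Because `tanh` squashes every step's output into (-1, 1), gradients flowing back through many such steps shrink multiplicatively, which is the vanishing-gradient problem mentioned above.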

LSTM

Long Short‑Term Memory. Uses forget, input, output gates and a cell state to regulate information flow.

Forget gate

Decides what to discard from cell state: f_t = σ(W_f·[h_{t-1}, x_t] + b_f).

Input gate

Decides which new information to store: i_t = σ(W_i·[h_{t-1}, x_t] + b_i).

Output gate

Controls what to output from cell state: o_t = σ(W_o·[h_{t-1}, x_t] + b_o).

GRU

Gated Recurrent Unit – combines the forget and input gates into a single update gate and merges the cell state and hidden state into one. Fewer parameters than an LSTM.
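A single GRU step can be sketched as follows (one common sign convention for the update gate; weight names and toy sizes are illustrative, not from a specific library):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h_prev, Wz, Wr, Wh, bz, br, bh):
    """One GRU step: update gate z, reset gate r, single hidden state h."""
    zin = np.concatenate([h_prev, x])
    z = sigmoid(Wz @ zin + bz)   # update gate (plays the role of forget + input)
    r = sigmoid(Wr @ zin + br)   # reset gate: how much past state feeds the candidate
    h_tilde = np.tanh(Wh @ np.concatenate([r * h_prev, x]) + bh)  # candidate state
    return (1 - z) * h_prev + z * h_tilde  # interpolate between old and new state

rng = np.random.default_rng(1)
H, D = 4, 3
Ws = [rng.normal(scale=0.1, size=(H, H + D)) for _ in range(3)]
bs = [np.zeros(H) for _ in range(3)]
h = gru_step(rng.normal(size=D), np.zeros(H), *Ws, *bs)
print(h.shape)  # (4,)
```

Compared with the LSTM equations below, there is no separate cell state and no output gate, which is where the parameter savings come from.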

BPTT

Backpropagation Through Time – unrolls the RNN and applies standard backprop across time steps.
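The vanishing-gradient problem falls directly out of BPTT: the gradient through k steps contains a product of k Jacobians of the form diag(tanh') · W, and when those factors have norm below 1 the product shrinks exponentially. A numeric sketch (random stand-in pre-activations, small recurrent weights chosen to make the effect visible):

```python
import numpy as np

rng = np.random.default_rng(2)
H = 4
W = rng.normal(scale=0.2, size=(H, H))  # small recurrent weights
grad = np.ones(H)                       # upstream gradient dL/dh_T

norms = []
for t in range(50):
    h = rng.normal(size=H)                   # stand-in pre-activations at step t
    jac = np.diag(1 - np.tanh(h) ** 2) @ W   # Jacobian dh_t / dh_{t-1}
    grad = jac.T @ grad                      # backprop one step further in time
    norms.append(np.linalg.norm(grad))

print(norms[0], norms[-1])  # the gradient norm shrinks toward zero
```

The LSTM's additive cell-state update (`c = f * c_prev + i * c_tilde` below) replaces this repeated matrix multiplication with an elementwise gated sum, which is why error can flow back much further.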

# LSTM gate equations (simplified, NumPy)
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.concatenate([h_prev, x])  # stacked input [h_{t-1}, x_t]
f = sigmoid(Wf @ z + bf)         # forget gate
i = sigmoid(Wi @ z + bi)         # input gate
c_tilde = np.tanh(Wc @ z + bc)   # candidate cell state
c = f * c_prev + i * c_tilde     # new cell state (additive update)
o = sigmoid(Wo @ z + bo)         # output gate
h = o * np.tanh(c)               # new hidden state
Interview tip: Be ready to explain why LSTM/GRU mitigate vanishing gradients – the additive cell-state update, gated by the forget gate, lets error flow back through many time steps without repeated squashing. Know the role of each gate and be able to compare LSTM vs GRU.

Common RNN/LSTM interview questions

  • What is the vanishing gradient problem in RNNs and how do LSTMs solve it?
  • Explain the purpose of the forget gate in LSTM.
  • How does GRU differ from LSTM?
  • What is backpropagation through time (BPTT)?
  • Why do we use tanh in the cell state update?
  • What are bidirectional RNNs?