NLP with Deep Learning · 15 questions · 25 min

NLP Deep Learning MCQ · test your knowledge

From word embeddings to Transformers – 15 questions covering RNNs, LSTMs, attention, BERT, and modern NLP.

Easy: 5 · Medium: 6 · Hard: 4
Word embeddings
RNN / LSTM
Attention
Transformers

Deep Learning for Natural Language Processing

Deep learning revolutionized NLP by enabling models to learn hierarchical representations of text. From word embeddings (Word2Vec, GloVe) to recurrent models (RNN, LSTM) and the Transformer architecture (BERT, GPT), this MCQ tests your understanding of how neural networks process language.

Why deep learning for NLP?

Traditional NLP relied on hand‑crafted features. Deep learning automatically learns useful representations, capturing syntax, semantics, and context directly from raw text.

NLP deep learning glossary – key concepts

Word embeddings

Dense vector representations of words (e.g., Word2Vec, GloVe, fastText) that capture semantic similarity.
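Semantic similarity between embeddings is usually measured with cosine similarity. A minimal sketch with toy hand-picked vectors (illustrative values, not from a trained model):

```python
import numpy as np

# Toy 4-dimensional "embeddings" (made-up values for illustration)
emb = {
    "king":  np.array([0.8, 0.6, 0.1, 0.0]),
    "queen": np.array([0.7, 0.7, 0.1, 0.1]),
    "apple": np.array([0.0, 0.1, 0.9, 0.8]),
}

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Related words point in similar directions, so their similarity is higher
assert cosine(emb["king"], emb["queen"]) > cosine(emb["king"], emb["apple"])
```

Trained embeddings like Word2Vec or GloVe produce the same effect at scale: words used in similar contexts end up close in vector space.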

RNN / LSTM / GRU

Recurrent architectures process sequences by maintaining a hidden state. LSTMs and GRUs mitigate vanishing gradients.
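The recurrence behind a vanilla RNN fits in a few lines; a sketch with randomly initialized weights (toy dimensions, no training):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 3, 4  # toy input and hidden sizes

# Vanilla RNN parameters, randomly initialized for illustration
W_xh = rng.normal(scale=0.1, size=(d_in, d_h))
W_hh = rng.normal(scale=0.1, size=(d_h, d_h))
b_h = np.zeros(d_h)

def rnn_forward(xs):
    """h_t = tanh(x_t W_xh + h_{t-1} W_hh + b); returns the final hidden state."""
    h = np.zeros(d_h)
    for x in xs:                      # one step per token, left to right
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)
    return h

seq = rng.normal(size=(5, d_in))      # a sequence of 5 token vectors
h_final = rnn_forward(seq)
print(h_final.shape)                  # (4,)
```

Because gradients flow through this recurrence step by step, they can vanish over long sequences; LSTMs and GRUs add gating to keep them alive.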

Attention mechanism

Allows the model to focus on relevant parts of the input when producing each output. Key component of Transformers.

Transformer

Architecture built entirely on self-attention, with no recurrence; all positions in a sequence can be processed in parallel, and long‑range dependencies are captured directly.
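Because self-attention on its own is order-invariant, Transformers inject word order via positional encodings. A sketch of the sinusoidal scheme from the original Transformer paper:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings:
    PE[pos, 2i]   = sin(pos / 10000^(2i/d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
    (d_model assumed even for simplicity)."""
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]      # (1, d_model/2)
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)               # even dims: sine
    pe[:, 1::2] = np.cos(angles)               # odd dims: cosine
    return pe

pe = positional_encoding(seq_len=10, d_model=8)
print(pe.shape)  # (10, 8)
```

These vectors are added to the token embeddings, so each position gets a unique, smoothly varying signature the attention layers can use.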

BERT (Bidirectional Encoder Representations from Transformers)

Pretrained Transformer encoder that is fine‑tuned for downstream NLP tasks. Pretrained with masked language modeling, so each token's representation conditions on context from both directions.
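The masking step of masked language modeling can be sketched in plain Python. This is a simplification: it masks ~15% of tokens but ignores BERT's 80/10/10 rule (80% `[MASK]`, 10% random token, 10% unchanged).

```python
import random

random.seed(1)  # seeded so the toy example is reproducible
tokens = ["the", "cat", "sat", "on", "the", "mat"]

# Replace ~15% of tokens with [MASK]; the model must predict the originals
# using context from BOTH sides of each masked position.
masked, labels = [], []
for tok in tokens:
    if random.random() < 0.15:
        masked.append("[MASK]")
        labels.append(tok)       # target the model must recover
    else:
        masked.append(tok)
        labels.append(None)      # no loss computed at this position
print(masked)
```

The loss is computed only at masked positions, which is why BERT's encoder can safely attend bidirectionally without "cheating."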

GPT (Generative Pretrained Transformer)

Autoregressive Transformer decoder pretrained to predict the next token; generates text left to right.
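What makes GPT autoregressive is the causal attention mask: each position may attend only to itself and earlier positions. A minimal sketch:

```python
import numpy as np

def causal_mask(seq_len):
    """Lower-triangular boolean mask: position i may attend only to
    positions j <= i, so no token can peek at the future."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

mask = causal_mask(4)
print(mask.astype(int))
# [[1 0 0 0]
#  [1 1 0 0]
#  [1 1 1 0]
#  [1 1 1 1]]
```

In practice the masked-out scores are set to a large negative value before the softmax, driving their attention weights to zero.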

Seq2seq with attention

Encoder‑decoder framework for machine translation, summarization, and other sequence‑to‑sequence tasks; attention lets the decoder consult every encoder state instead of a single fixed context vector.

# Self-attention in a nutshell: Attention(Q,K,V) = softmax(QK^T / sqrt(d_k)) V
import numpy as np
def attention(Q, K, V):                            # each token attends to all tokens
    s = np.exp(Q @ K.T / np.sqrt(K.shape[-1]))     # similarity, scaled by sqrt(d_k)
    return (s / s.sum(-1, keepdims=True)) @ V      # softmax-weighted sum of values
Interview tip: Be prepared to explain the difference between RNNs and Transformers, how attention works, why positional encodings are needed, and what makes BERT bidirectional.

Common NLP deep learning interview questions

  • How do word embeddings capture semantic meaning?
  • What are the advantages of LSTMs over vanilla RNNs?
  • Explain the attention mechanism and its role in Transformers.
  • Why do Transformers use positional encodings?
  • What is the difference between BERT and GPT?
  • How does masked language modeling work in BERT pretraining?