NLP Interview: 20 Essential Q&A
Master Natural Language Processing fundamentals, from preprocessing to transformers. Short, crisp answers for interview success.
Tags: tokenization · word2vec · BERT · transformers · sentiment
1. What is tokenization in NLP? (⚡ easy)
Answer: Tokenization splits text into smaller units (tokens) – words, subwords, or characters. Example: "NLP is fun" → ["NLP", "is", "fun"]. Essential for preprocessing.
from nltk.tokenize import word_tokenize  # requires: nltk.download("punkt")
word_tokenize("NLP is fun")  # ['NLP', 'is', 'fun']
Tags: preprocessing · splitting
2. What is the difference between stemming and lemmatization? (⚡ easy)
Answer: Stemming chops off affixes (rule-based, e.g., "running" → "run"). Lemmatization uses vocabulary/dictionary to return base form ("better" → "good"). Lemmatization is more accurate.
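A minimal NLTK contrast (assumes the wordnet corpus has been downloaded):

from nltk.stem import PorterStemmer, WordNetLemmatizer  # requires: nltk.download("wordnet")
print(PorterStemmer().stem("running"))                   # 'run'  (rule-based suffix chopping)
print(WordNetLemmatizer().lemmatize("better", pos="a"))  # 'good' (dictionary lookup; pos='a' = adjective)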
3. What is a stop word? (⚡ easy)
Answer: Common words (the, is, at) that are often removed because they carry little semantic meaning. Context matters, though: they are not always removed (e.g., "not" appears on many stop-word lists but is critical for sentiment).
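A quick sketch using NLTK's English stop-word list (assumes the stopwords corpus is downloaded):

from nltk.corpus import stopwords  # requires: nltk.download("stopwords")
sw = set(stopwords.words("english"))
tokens = ["the", "cat", "is", "on", "the", "mat"]
print([t for t in tokens if t not in sw])  # ['cat', 'mat']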
4. Explain the Bag-of-Words (BoW) model. (📊 medium)
Answer: BoW represents text as a multiset of words, ignoring order and grammar. Creates a sparse vector of word counts. Simple but loses context.
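A minimal sketch with scikit-learn's CountVectorizer:

from sklearn.feature_extraction.text import CountVectorizer
corpus = ["NLP is fun", "NLP is hard"]
vec = CountVectorizer()
X = vec.fit_transform(corpus)        # sparse matrix of word counts
print(vec.get_feature_names_out())   # ['fun' 'hard' 'is' 'nlp']
print(X.toarray())                   # [[1 0 1 1], [0 1 1 1]] -- order and grammar are gone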
5. What is TF-IDF? (📊 medium)
Answer: Term Frequency–Inverse Document Frequency. Weights words by how often they appear in a document (TF) and how rare across corpus (IDF). Highlights important words.
TF(t, d) = count of t in d / total terms in d
IDF(t) = log(N / df(t)), where N = total documents and df(t) = documents containing t
TF-IDF(t, d) = TF(t, d) × IDF(t)
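In practice, scikit-learn's TfidfVectorizer computes a smoothed variant of these formulas; a minimal sketch:

from sklearn.feature_extraction.text import TfidfVectorizer
corpus = ["the cat sat", "the dog barked"]
vec = TfidfVectorizer()
X = vec.fit_transform(corpus)
print(vec.get_feature_names_out())  # ['barked' 'cat' 'dog' 'sat' 'the']
print(X.toarray().round(2))         # 'the' gets the lowest weight: it appears in every document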
6. Word embeddings vs. one-hot encoding? (📊 medium)
Answer: One-hot creates large sparse vectors with no similarity. Embeddings (word2vec, GloVe) are dense, low-dimensional, and capture semantic relationships (king - man + woman ≈ queen).
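The classic analogy can be checked with gensim's pretrained GloVe vectors (the first call downloads roughly 66 MB):

import gensim.downloader as api
glove = api.load("glove-wiki-gigaword-50")  # pretrained 50-dimensional GloVe vectors
print(glove.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# [('queen', 0.85...)] -- the exact score varies by model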
7. What is Word2Vec and its two architectures? (📊 medium)
Answer: Word2Vec learns dense word embeddings with a shallow neural network. CBOW predicts the target word from its context; Skip-gram predicts context words from the target. Skip-gram works better for rare words.
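Training both architectures with gensim on a toy corpus (the sg flag toggles Skip-gram):

from gensim.models import Word2Vec
sentences = [["nlp", "is", "fun"], ["nlp", "is", "hard"]]
cbow = Word2Vec(sentences, vector_size=50, sg=0, min_count=1)      # CBOW
skipgram = Word2Vec(sentences, vector_size=50, sg=1, min_count=1)  # Skip-gram
print(skipgram.wv["nlp"].shape)  # (50,) dense vector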
8. Define perplexity in language models. (🔥 hard)
Answer: Perplexity measures how well a probability model predicts a sample; lower perplexity means better generalization. Perplexity = 2^(cross-entropy), with cross-entropy measured in bits.
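A worked toy example, assuming the model assigned these per-token probabilities to a test sentence:

import math
probs = [0.2, 0.1, 0.5, 0.25]                                   # P(token) from a toy model
cross_entropy = -sum(math.log2(p) for p in probs) / len(probs)  # bits per token
print(round(2 ** cross_entropy, 2))  # 4.47 -- as uncertain as picking among ~4.5 words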
9. What is an N-gram model? (📊 medium)
Answer: An N-gram model predicts the next word from the previous N−1 words. Unigram (N=1), bigram (N=2), trigram (N=3). Simple but suffers from data sparsity.
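A maximum-likelihood bigram model in a few lines (each probability is just a bigram count over a unigram count):

from collections import Counter
corpus = "the cat sat on the mat the cat ran".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)
p = {w2: c / unigrams["the"] for (w1, w2), c in bigrams.items() if w1 == "the"}
print(p)  # {'cat': 0.666..., 'mat': 0.333...} -- P(next word | "the")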
10. Name common POS tagging algorithms. (📊 medium)
Answer: Hidden Markov Models (HMM), Conditional Random Fields (CRF), and deep learning (BiLSTM + CRF, transformer-based).
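For quick experiments, NLTK ships an off-the-shelf averaged-perceptron tagger (assumes the punkt and averaged_perceptron_tagger resources are downloaded):

import nltk  # requires: nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")
tokens = nltk.word_tokenize("The cat sat on the mat")
print(nltk.pos_tag(tokens))
# [('The', 'DT'), ('cat', 'NN'), ('sat', 'VBD'), ('on', 'IN'), ('the', 'DT'), ('mat', 'NN')]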
11. What is Named Entity Recognition (NER)? (⚡ easy)
Answer: NER locates and classifies entities (person, organization, location) in text, e.g., tagging "Apple" as ORG.
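A minimal spaCy sketch (assumes the en_core_web_sm model is installed via python -m spacy download en_core_web_sm):

import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is opening a new office in London")
print([(ent.text, ent.label_) for ent in doc.ents])  # [('Apple', 'ORG'), ('London', 'GPE')]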
12. Explain the attention mechanism. (🔥 hard)
Answer: Attention lets a model focus on the relevant parts of the input when producing each output. It computes a weighted sum of values, with weights derived from query-key similarity. Self-attention is attention within the same sequence.
Attention(Q, K, V) = softmax(QKᵀ / √d_k) V
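The formula translates almost directly into NumPy; a self-contained sketch:

import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
    scores = Q @ K.T / np.sqrt(K.shape[-1])               # query-key similarity
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                    # row-wise softmax
    return w @ V

x = np.random.rand(4, 8)         # 4 tokens, d_model = 8
print(attention(x, x, x).shape)  # self-attention over the same sequence: (4, 8)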
13. Transformer architecture in one sentence? (🔥 hard)
Answer: The Transformer stacks multi-head self-attention and feed-forward layers with no recurrence, enabling parallel training and modeling of long-range dependencies.
14. How does BERT differ from GPT? (🔥 hard)
Answer: BERT is encoder-only, bidirectional (masked LM), excels at understanding tasks (classification, QA). GPT is decoder-only, autoregressive (left-to-right), designed for generation.
15. What is the purpose of masked language modeling? (📊 medium)
Answer: MLM (used in BERT) masks random tokens and trains the model to predict them, forcing it to use bidirectional context. Great for learning deep representations.
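You can see MLM in action with the Hugging Face fill-mask pipeline (downloads bert-base-uncased on first use):

from transformers import pipeline
fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("Paris is the [MASK] of France.")[:3]:
    print(pred["token_str"], round(pred["score"], 3))  # 'capital' ranks first; scores vary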
16. What is sentiment analysis? (⚡ easy)
Answer: Classifying text polarity (positive, negative, neutral). Often uses LSTMs, transformers, or lexicon-based methods.
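A one-liner with the Hugging Face pipeline (downloads a default fine-tuned model on first use):

from transformers import pipeline
clf = pipeline("sentiment-analysis")
print(clf("I love this movie!"))  # [{'label': 'POSITIVE', 'score': 0.99...}]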
17. Explain beam search in text generation. (🔥 hard)
Answer: Beam search keeps the top-k hypotheses at each step, reducing the risk of missing high-probability sequences; k is the beam width. It trades off between greedy decoding (k = 1) and exhaustive search.
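A minimal sketch; next_probs is a hypothetical stand-in for a real language model's next-token distribution:

import math

def beam_search(next_probs, start, beam_width=2, steps=3):
    beams = [([start], 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(steps):
        candidates = [(seq + [tok], score + math.log(p))
                      for seq, score in beams
                      for tok, p in next_probs(seq).items()]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]  # keep top-k
    return beams

toy = lambda seq: {"a": 0.6, "b": 0.4}  # stand-in for P(next token | sequence)
print(beam_search(toy, "<s>", beam_width=2, steps=2))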
18. What is the BLEU score? (📊 medium)
Answer: BLEU (Bilingual Evaluation Understudy) measures n-gram overlap between generated and reference text. Common for machine translation and summarization. Ranges from 0 to 1.
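NLTK provides a reference implementation (smoothing avoids zero scores when higher-order n-grams have no matches):

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
reference = [["the", "cat", "is", "on", "the", "mat"]]
candidate = ["the", "cat", "sat", "on", "the", "mat"]
score = sentence_bleu(reference, candidate, smoothing_function=SmoothingFunction().method1)
print(round(score, 3))  # closer to 1 = higher n-gram overlap with the reference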
19. What is coreference resolution? (🔥 hard)
Answer: Identifying when two or more expressions refer to the same entity. In "John said he would come", "John" and "he" corefer.
20. What are some challenges in NLP? (📊 medium)
Answer: Ambiguity, context dependence, sarcasm, low-resource languages, bias in models, and commonsense reasoning. All remain active research areas.
Tags: 🔄 ambiguity · 🧠 commonsense · 🌍 low-resource
NLP interview cheat sheet
- Tokenization / stemming / lemmatization
- BoW, TF-IDF, embeddings
- RNN / LSTM / Attention
- Transformers (BERT, GPT)
- NER, POS, sentiment
- Evaluation: BLEU, perplexity
Pro tip: Understand trade-offs between classical and deep learning approaches in NLP.