NLP Interview: 20 Essential Q&A
Master Natural Language Processing fundamentals, from preprocessing to transformers. Short, crisp answers for interview success.
Tags: tokenization · word2vec · BERT · transformers · sentiment
1. What is tokenization in NLP? (⚡ easy)
Answer: Tokenization splits text into smaller units (tokens) – words, subwords, or characters. Example: "NLP is fun" → ["NLP", "is", "fun"]. Essential for preprocessing.
from nltk.tokenize import word_tokenize  # requires: nltk.download("punkt")
word_tokenize("NLP is fun")  # ['NLP', 'is', 'fun']
Tags: preprocessing · splitting
2. What is the difference between stemming and lemmatization? (⚡ easy)
Answer: Stemming chops off affixes (rule-based, e.g., "running" → "run"). Lemmatization uses vocabulary/dictionary to return base form ("better" → "good"). Lemmatization is more accurate.
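A minimal NLTK contrast (assumes the wordnet corpus has been downloaded):

from nltk.stem import PorterStemmer, WordNetLemmatizer  # requires: nltk.download("wordnet")
print(PorterStemmer().stem("running"))                   # 'run'  (rule-based suffix chopping)
print(WordNetLemmatizer().lemmatize("better", pos="a"))  # 'good' (dictionary lookup; pos='a' = adjective)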
3. What is a stop word? (⚡ easy)
Answer: Common words (the, is, at) that are often removed because they carry little semantic meaning. Context matters, though: they are not always removed (e.g., "not" appears on many stop-word lists but is critical for sentiment).
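A quick sketch using NLTK's English stop-word list (assumes the stopwords corpus is downloaded):

from nltk.corpus import stopwords  # requires: nltk.download("stopwords")
sw = set(stopwords.words("english"))
tokens = ["the", "cat", "is", "on", "the", "mat"]
print([t for t in tokens if t not in sw])  # ['cat', 'mat']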
4. Explain the Bag-of-Words (BoW) model. (📊 medium)
Answer: BoW represents text as a multiset of words, ignoring order and grammar. Creates a sparse vector of word counts. Simple but loses context.
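A minimal sketch with scikit-learn's CountVectorizer:

from sklearn.feature_extraction.text import CountVectorizer
corpus = ["NLP is fun", "NLP is hard"]
vec = CountVectorizer()
X = vec.fit_transform(corpus)        # sparse matrix of word counts
print(vec.get_feature_names_out())   # ['fun' 'hard' 'is' 'nlp']
print(X.toarray())                   # [[1 0 1 1], [0 1 1 1]] -- order and grammar are gone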
5. What is TF-IDF? (📊 medium)
Answer: Term Frequency–Inverse Document Frequency. Weights words by how often they appear in a document (TF) and how rare across corpus (IDF). Highlights important words.
TF(t, d) = count of t in d / total terms in d
IDF(t) = log(N / df(t)), where N = total documents and df(t) = documents containing t
TF-IDF(t, d) = TF(t, d) × IDF(t)
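In practice, scikit-learn's TfidfVectorizer computes a smoothed variant of these formulas; a minimal sketch:

from sklearn.feature_extraction.text import TfidfVectorizer
corpus = ["the cat sat", "the dog barked"]
vec = TfidfVectorizer()
X = vec.fit_transform(corpus)
print(vec.get_feature_names_out())  # ['barked' 'cat' 'dog' 'sat' 'the']
print(X.toarray().round(2))         # 'the' gets the lowest weight: it appears in every document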
6. Word embeddings vs. one-hot encoding? (📊 medium)
Answer: One-hot creates large sparse vectors with no similarity. Embeddings (word2vec, GloVe) are dense, low-dimensional, and capture semantic relationships (king - man + woman ≈ queen).
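The classic analogy can be checked with gensim's pretrained GloVe vectors (the first call downloads roughly 66 MB):

import gensim.downloader as api
glove = api.load("glove-wiki-gigaword-50")  # pretrained 50-dimensional GloVe vectors
print(glove.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# [('queen', 0.85...)] -- the exact score varies by model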
7. What is Word2Vec and its two architectures? (📊 medium)
Answer: Word2Vec learns dense word embeddings with a shallow neural network. CBOW predicts the target word from its context; Skip-gram predicts context words from the target. Skip-gram works better for rare words.
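Training both architectures with gensim on a toy corpus (the sg flag toggles Skip-gram):

from gensim.models import Word2Vec
sentences = [["nlp", "is", "fun"], ["nlp", "is", "hard"]]
cbow = Word2Vec(sentences, vector_size=50, sg=0, min_count=1)      # CBOW
skipgram = Word2Vec(sentences, vector_size=50, sg=1, min_count=1)  # Skip-gram
print(skipgram.wv["nlp"].shape)  # (50,) dense vector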
8. Define perplexity in language models. (🔥 hard)
Answer: Perplexity measures how well a probability model predicts a sample; lower perplexity means better generalization. Perplexity = 2^(cross-entropy), with cross-entropy measured in bits.
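A worked toy example, assuming the model assigned these per-token probabilities to a test sentence:

import math
probs = [0.2, 0.1, 0.5, 0.25]                                   # P(token) from a toy model
cross_entropy = -sum(math.log2(p) for p in probs) / len(probs)  # bits per token
print(round(2 ** cross_entropy, 2))  # 4.47 -- as uncertain as picking among ~4.5 words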
9. What is an N-gram model? (📊 medium)
Answer: An N-gram model predicts the next word from the previous N−1 words. Unigram (N=1), bigram (N=2), trigram (N=3). Simple but suffers from data sparsity.
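A maximum-likelihood bigram model in a few lines (each probability is just a bigram count over a unigram count):

from collections import Counter
corpus = "the cat sat on the mat the cat ran".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)
p = {w2: c / unigrams["the"] for (w1, w2), c in bigrams.items() if w1 == "the"}
print(p)  # {'cat': 0.666..., 'mat': 0.333...} -- P(next word | "the")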
10. Name common POS tagging algorithms. (📊 medium)
Answer: Hidden Markov Models (HMM), Conditional Random Fields (CRF), and deep learning (BiLSTM + CRF, transformer-based).
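For quick experiments, NLTK ships an off-the-shelf averaged-perceptron tagger (assumes the punkt and averaged_perceptron_tagger resources are downloaded):

import nltk  # requires: nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")
tokens = nltk.word_tokenize("The cat sat on the mat")
print(nltk.pos_tag(tokens))
# [('The', 'DT'), ('cat', 'NN'), ('sat', 'VBD'), ('on', 'IN'), ('the', 'DT'), ('mat', 'NN')]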
11. What is Named Entity Recognition (NER)? (⚡ easy)
Answer: NER locates and classifies entities (person, organization, location) in text, e.g., tagging "Apple" as ORG.
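A minimal spaCy sketch (assumes the en_core_web_sm model is installed via python -m spacy download en_core_web_sm):

import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is opening a new office in London")
print([(ent.text, ent.label_) for ent in doc.ents])  # [('Apple', 'ORG'), ('London', 'GPE')]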
12. Explain the attention mechanism. (🔥 hard)
Answer: Attention lets a model focus on the relevant parts of the input when producing each output. It computes a weighted sum of values, with weights derived from query-key similarity. Self-attention is attention within the same sequence.
Attention(Q, K, V) = softmax(QKᵀ / √d_k) V
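The formula translates almost directly into NumPy; a self-contained sketch:

import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
    scores = Q @ K.T / np.sqrt(K.shape[-1])               # query-key similarity
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                    # row-wise softmax
    return w @ V

x = np.random.rand(4, 8)         # 4 tokens, d_model = 8
print(attention(x, x, x).shape)  # self-attention over the same sequence: (4, 8)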
13. Transformer architecture in one sentence? (🔥 hard)
Answer: The Transformer stacks multi-head self-attention and feed-forward layers with no recurrence, enabling parallel training and modeling of long-range dependencies.
14. How does BERT differ from GPT? (🔥 hard)
Answer: BERT is encoder-only, bidirectional (masked LM), excels at understanding tasks (classification, QA). GPT is decoder-only, autoregressive (left-to-right), designed for generation.
15. What is the purpose of masked language modeling? (📊 medium)
Answer: MLM (used in BERT) masks random tokens and trains the model to predict them, forcing it to use bidirectional context. Great for learning deep representations.
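You can see MLM in action with the Hugging Face fill-mask pipeline (downloads bert-base-uncased on first use):

from transformers import pipeline
fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("Paris is the [MASK] of France.")[:3]:
    print(pred["token_str"], round(pred["score"], 3))  # 'capital' ranks first; scores vary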
16. What is sentiment analysis? (⚡ easy)
Answer: Classifying text polarity (positive, negative, neutral). Often uses LSTMs, transformers, or lexicon-based methods.
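A one-liner with the Hugging Face pipeline (downloads a default fine-tuned model on first use):

from transformers import pipeline
clf = pipeline("sentiment-analysis")
print(clf("I love this movie!"))  # [{'label': 'POSITIVE', 'score': 0.99...}]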
17. Explain beam search in text generation. (🔥 hard)
Answer: Beam search keeps the top-k hypotheses at each step, reducing the risk of missing high-probability sequences; k is the beam width. It trades off between greedy decoding (k = 1) and exhaustive search.
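A minimal sketch; next_probs is a hypothetical stand-in for a real language model's next-token distribution:

import math

def beam_search(next_probs, start, beam_width=2, steps=3):
    beams = [([start], 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(steps):
        candidates = [(seq + [tok], score + math.log(p))
                      for seq, score in beams
                      for tok, p in next_probs(seq).items()]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]  # keep top-k
    return beams

toy = lambda seq: {"a": 0.6, "b": 0.4}  # stand-in for P(next token | sequence)
print(beam_search(toy, "<s>", beam_width=2, steps=2))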
18. What is the BLEU score? (📊 medium)
Answer: BLEU (Bilingual Evaluation Understudy) measures n-gram overlap between generated and reference text. Common for machine translation and summarization. Ranges from 0 to 1.
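NLTK provides a reference implementation (smoothing avoids zero scores when higher-order n-grams have no matches):

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
reference = [["the", "cat", "is", "on", "the", "mat"]]
candidate = ["the", "cat", "sat", "on", "the", "mat"]
score = sentence_bleu(reference, candidate, smoothing_function=SmoothingFunction().method1)
print(round(score, 3))  # closer to 1 = higher n-gram overlap with the reference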
19. What is coreference resolution? (🔥 hard)
Answer: Identifying when two or more expressions refer to the same entity. In "John said he would come", "John" and "he" corefer.
20. What are some challenges in NLP? (📊 medium)
Answer: Ambiguity, context dependence, sarcasm, low-resource languages, bias in models, and commonsense reasoning. All remain active research areas.
Tags: 🔄 ambiguity · 🧠 commonsense · 🌍 low-resource
NLP interview cheat sheet
- Tokenization / stemming / lemmatization
- BoW, TF-IDF, embeddings
- RNN / LSTM / Attention
- Transformers (BERT, GPT)
- NER, POS, sentiment
- Evaluation: BLEU, perplexity
Pro tip: Understand trade-offs between classical and deep learning approaches in NLP.