Language Models
Understand the core concept of Language Modeling: assigning probabilities to sequences of words.
What is a Language Model?
At its absolute core, a Language Model (LM) does one simple mathematical thing: it assigns probabilities to sequences of words. It determines how "likely" a specific sentence is to exist in a given language.
High Probability (Valid English): "The students opened their books."
Low Probability (Gibberish): "Books their opened students the."
The Goal: Next Word Prediction
By the chain rule of probability, assigning a probability to a full sentence is mathematically equivalent to repeatedly predicting the next word given all the words before it (an autoregressive task).
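Concretely, the chain rule factors a sentence's probability into a product of next-word probabilities:

```latex
P(w_1, w_2, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1})
```

Each factor is exactly the next-word prediction task: the probability of word \(w_i\) given everything that came before it.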
Given a context such as "I walked into the coffee shop and ordered a ___", a good language model will assign a high probability to words like "latte" or "cappuccino", and a near-zero probability to words like "car" or "elephant".
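The idea can be made concrete with a toy bigram model (the simplest n-gram, conditioning on only one previous word). This is an illustrative sketch on a made-up three-sentence corpus, not a real training setup:

```python
from collections import Counter, defaultdict

# Toy corpus; a real model would estimate counts from billions of words.
corpus = "i ordered a latte . i ordered a cappuccino . i drove a car .".split()

# Count bigrams: how often each word follows each other word.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_probs(prev):
    """Estimate P(next | prev) from the bigram counts."""
    counts = bigram_counts[prev]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def sentence_prob(words):
    """Chain rule with a bigram approximation:
    P(w1..wn) ~ product of P(wi | w(i-1))."""
    p = 1.0
    for prev, nxt in zip(words, words[1:]):
        p *= next_word_probs(prev).get(nxt, 0.0)
    return p

# A fluent sentence gets nonzero probability; scrambled word order gets zero.
print(sentence_prob(["i", "ordered", "a", "latte"]))
print(sentence_prob(["latte", "a", "ordered", "i"]))
```

Even this tiny model captures the core behavior: valid word orders score higher than gibberish. Its weakness is the table's point about n-grams: it only ever sees one word of context.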
The Evolution of Language Models
| Era | Model Type | How it predicts the next word |
|---|---|---|
| 1990s | Statistical N-gram Models | Counts how often each word follows the previous (n-1) words in a training corpus. Extremely limited memory. |
| 2010s | Recurrent Neural Nets (RNNs) | Passes a "hidden state" vector left-to-right through the sequence. Can carry context in principle, but long-range dependencies fade due to vanishing gradients. |
| 2018 - Present | Transformer LLMs (GPT) | Uses "Self-Attention" to relate every word in the sentence to every other word simultaneously. Scales to hundreds of billions of parameters and shows strong in-context reasoning abilities. |
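The self-attention mechanism in the last row can be sketched in a few lines of numpy. This is a minimal single-head illustration with an assumed identity Q/K/V projection (real Transformer layers learn separate projection matrices and use many heads):

```python
import numpy as np

def causal_self_attention(x):
    """Single-head self-attention with a causal mask: each position
    attends to itself and earlier positions, never to the future."""
    seq_len, d = x.shape
    q, k, v = x, x, x                       # assumption: identity projections, for brevity
    scores = q @ k.T / np.sqrt(d)           # similarity of every word to every word
    mask = np.triu(np.ones((seq_len, seq_len)), 1).astype(bool)
    scores[mask] = -np.inf                  # block attention to future positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ v, weights

x = np.random.default_rng(0).normal(size=(4, 8))  # 4 "words", 8-dim embeddings
out, w = causal_self_attention(x)
```

Unlike an RNN's left-to-right hidden state, every position here is computed in one matrix operation; the causal mask is what makes the model usable for next-word prediction.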