RNN for NLP
Explore vanilla recurrent networks and bidirectional variants.
Recurrent Neural Networks (RNN)
Recurrent Neural Networks (RNNs) are a class of artificial neural networks designed specifically for processing sequential data, such as time series, audio, or natural language text.
How RNNs Differ from Feed-Forward Nets
Feed-Forward (Standard DNN)
A standard neural network processes the entire input all at once and has a fixed input size. It has no concept of "memory" or order.
Recurrent (RNN)
RNNs process sequences step by step. They maintain a Hidden State (a memory vector) that is updated at every step as the network reads through the sentence word by word.
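As a minimal sketch of that update in plain NumPy (weight shapes here are illustrative; a real layer learns W_x, W_h, and b during training):
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    # The new hidden state mixes the current word vector with the previous memory
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

# Illustrative sizes: 32-dim word vectors, 64-dim hidden state
rng = np.random.default_rng(0)
W_x = rng.normal(size=(32, 64)) * 0.1
W_h = rng.normal(size=(64, 64)) * 0.1
b = np.zeros(64)

h = np.zeros(64)                       # memory starts empty
for x_t in rng.normal(size=(10, 32)):  # a toy "sentence" of 10 word vectors
    h = rnn_step(x_t, h, W_x, W_h, b)  # memory is updated word by word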
Level 1 — Building an RNN in Keras
RNNs excel at sequence classification (like sentiment analysis). Here, we process words sequentially to determine if a review is positive or negative.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense, Embedding
vocab_size = 10000
embedding_dim = 32
max_sequence_length = 100
model = Sequential([
# Turn positive integers (word indices) into dense vectors of fixed size
Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_sequence_length),
# Vanilla RNN layer: maintains a 64-dimension hidden state across time steps
SimpleRNN(64, return_sequences=False),
# Binary classification output layer
Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
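To put the model to work, a minimal training sketch on the IMDB reviews bundled with Keras could look like this (epochs and batch size are illustrative, not tuned):
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Reviews arrive as lists of word indices; keep only the 10,000 most frequent words
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=vocab_size)

# Pad/truncate every review to the fixed length expected by the Embedding layer
x_train = pad_sequences(x_train, maxlen=max_sequence_length)
x_test = pad_sequences(x_test, maxlen=max_sequence_length)

model.fit(x_train, y_train, epochs=3, batch_size=128, validation_split=0.2)
model.evaluate(x_test, y_test)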
Level 2 — Bidirectional RNNs
A standard RNN only knows what came before the current word. A Bidirectional RNN processes the sentence forwards AND backwards simultaneously and concatenates the two hidden states, so every word is read with context from both the start and the end of the sentence!
from tensorflow.keras.layers import Bidirectional
bi_model = Sequential([
Embedding(input_dim=vocab_size, output_dim=embedding_dim),
# By wrapping the RNN in Bidirectional, Keras automatically handles
# the forward and backward passes and combines them.
Bidirectional(SimpleRNN(64)),
Dense(1, activation='sigmoid')
])
bi_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
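By default the wrapper concatenates the forward and backward hidden states, so the 64-unit SimpleRNN yields a 128-dimension feature vector per sequence. A quick standalone check (the dummy shapes are chosen just for illustration):
import numpy as np
from tensorflow.keras.layers import Bidirectional, SimpleRNN

layer = Bidirectional(SimpleRNN(64))                   # default merge_mode='concat'
dummy = np.random.randn(2, 100, 32).astype("float32")  # (batch, time steps, features)
print(layer(dummy).shape)                               # (2, 128): forward 64 + backward 64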
The Core Flaw: The Vanishing Gradient Problem
Why not stop at vanilla RNNs? As an RNN processes a very long sequence (like a paragraph), backpropagation multiplies the gradient by one factor per time step. When those factors are smaller than 1, the product quickly shrinks toward 0 (the Vanishing Gradient). Result: a vanilla RNN has only short-term memory and has forgotten the beginning of a sentence by the time it reaches the end. This led to the creation of LSTMs!
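A back-of-the-envelope illustration: suppose each backprop step scales the gradient by roughly 0.9 (a made-up but plausible per-step factor).
# After backpropagating through a 100-word sequence:
grad_scale = 0.9 ** 100
print(grad_scale)  # ~2.7e-05, so the signal from the first word is effectively gone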