ROUGE Score
Recall-Oriented Understudy for Gisting Evaluation.
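BLEU is precision-oriented and is used mainly for machine translation, whereas ROUGE is recall-oriented: it measures how much of a human-written reference is reproduced in the generated text. It is primarily used to evaluate text summarization.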
Level 1 — The Variants
There are three main types of ROUGE:
- ROUGE-1: Overlap of individual words.
- ROUGE-2: Overlap of word pairs (bigrams).
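- ROUGE-L: Based on the Longest Common Subsequence (LCS) between the generated text and the reference. It rewards words that appear in the same order even when they are not adjacent, so it reflects sentence structure better than fixed-size n-grams.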
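The three variants above can be sketched in a few lines of plain Python. This is a minimal illustration, not the reference implementation: it assumes lowercase whitespace tokenization and reports recall only, and the helper names (`ngrams`, `rouge_n_recall`, `lcs_length`, `rouge_l_recall`) are made up for this example.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n):
    """Share of reference n-grams that also occur in the candidate."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count, so repeats are clipped.
    overlap = sum((cand & ref).values())
    return overlap / total

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_recall(candidate, reference):
    """LCS length divided by the reference length."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return lcs_length(cand, ref) / len(ref) if ref else 0.0

reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"

print(rouge_n_recall(candidate, reference, 1))  # 5 of 6 unigrams match
print(rouge_n_recall(candidate, reference, 2))  # 3 of 5 bigrams match
print(rouge_l_recall(candidate, reference))     # LCS has 5 tokens out of 6
```

A single changed word costs one unigram but two bigrams, which is why ROUGE-2 is usually the stricter of the two n-gram scores.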
Level 2 — Precision vs Recall
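A summary needs to capture the "gist" of the original. ROUGE checks this by comparing the generated summary against a reference summary. High ROUGE recall means most of the reference's words or n-grams appear in the output, which suggests the important points were covered. High ROUGE precision means most of the output also appears in the reference, so the model didn't include much unnecessary "fluff".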
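To make the trade-off concrete, here is a small sketch that computes unigram precision, recall, and F1 for two made-up summaries. It assumes whitespace tokenization; `rouge_1` is a hypothetical helper written for this example, not a library function.

```python
from collections import Counter

def rouge_1(candidate, reference):
    """Unigram precision, recall and F1 with clipped counts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"p": precision, "r": recall, "f": f1}

reference = "the storm closed the airport"
short = "the storm"  # nothing wasted, but most of the reference is missing
padded = "honestly the big storm closed the whole airport for everyone today"

print(rouge_1(short, reference))   # precision 1.0, recall 0.4
print(rouge_1(padded, reference))  # recall 1.0, precision 5/11
```

The short summary is all signal but misses points; the padded one covers everything but buries it in filler. F1 balances the two, which is why it is the number most often reported.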
Level 3 — Use Cases
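ROUGE is a standard metric for evaluating LLMs on summarization tasks. However, it only counts surface word overlap, so it can be "gamed" (for example, by padding the output with extra words to lift recall) and it gives no credit to valid paraphrases. It is therefore often used alongside BERTScore, which compares contextual embeddings for better semantic matching.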
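The gap between word overlap and meaning is easy to demonstrate. The sketch below uses a hand-rolled unigram F1 (`unigram_f1` is a hypothetical helper assuming whitespace tokenization) on two invented sentences: one reuses the reference's words with the meaning reversed, the other is a faithful paraphrase.

```python
from collections import Counter

def unigram_f1(candidate, reference):
    """ROUGE-1 style F1 over whitespace tokens, with clipped counts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

reference = "the doctor advised the patient to rest"
scrambled = "the patient advised the doctor to rest"         # same words, reversed meaning
paraphrase = "a physician told him he should take it easy"  # same meaning, no shared words

print(unigram_f1(scrambled, reference))   # 1.0
print(unigram_f1(paraphrase, reference))  # 0.0
```

An embedding-based metric is meant to score the paraphrase well above zero, which is the motivation for reporting it next to ROUGE.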
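# pip install rouge
from rouge import Rouge

rouge = Rouge()
hypothesis = "I like natural language processing."  # model output
reference = "I love natural language processing."   # human-written target

# get_scores returns one dict per pair, keyed by 'rouge-1', 'rouge-2' and 'rouge-l'
scores = rouge.get_scores(hypothesis, reference)

# Each entry holds recall ('r'), precision ('p') and F1 ('f')
print(scores[0]['rouge-l'])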