Association Rule Mining
Learn how to discover interesting relationships between items, such as products often bought together, using association rules.
Key Concepts
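- Support: the fraction of transactions in the dataset that contain an itemset.
- Confidence: how often the rule A ⇒ B holds when A occurs, i.e. the fraction of transactions containing A that also contain B.
- Lift: how much more often A and B occur together than expected if they were independent. A lift above 1 suggests a positive association.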
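The three measures can be worked out by hand on a handful of baskets. The following is a minimal sketch in plain Python, not a library API: the helper names `count`, `support`, `confidence`, and `lift` and the five sample baskets are assumptions made for illustration.

```python
# Illustrative baskets (an assumption for this sketch)
baskets = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "coke"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "coke"},
]


def count(itemset):
    """Number of baskets that contain every item in `itemset`."""
    return sum(1 for basket in baskets if set(itemset) <= basket)


def support(itemset):
    """Fraction of baskets that contain the itemset."""
    return count(itemset) / len(baskets)


def confidence(antecedent, consequent):
    """Share of baskets with the antecedent that also hold the consequent.

    Assumes the antecedent occurs in at least one basket.
    """
    both = set(antecedent) | set(consequent)
    return count(both) / count(antecedent)


def lift(antecedent, consequent):
    """Confidence of A => B divided by the support of B."""
    return confidence(antecedent, consequent) / support(consequent)


A, B = {"diaper"}, {"beer"}
print("support(A and B):", round(support(A | B), 2))
print("confidence(A => B):", round(confidence(A, B), 2))
print("lift(A => B):", round(lift(A, B), 2))
```

In these baskets diaper appears in 4 of 5 transactions, beer in 3, and all 3 beer baskets also contain diaper. The rule diaper ⇒ beer therefore has support 3/5 = 0.6, confidence 3/4 = 0.75, and lift 0.75 / 0.6 = 1.25.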
Example: Apriori with mlxtend
We use a small transaction dataset, convert it to one-hot encoded format, find frequent itemsets with Apriori, then generate association rules.
Frequent Itemsets & Rules
# pip install mlxtend
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
transactions = [
    ["bread", "milk"],
    ["bread", "diaper", "beer", "eggs"],
    ["milk", "diaper", "beer", "coke"],
    ["bread", "milk", "diaper", "beer"],
    ["bread", "milk", "diaper", "coke"],
]
# Convert to one-hot encoded DataFrame
unique_items = sorted({item for basket in transactions for item in basket})
one_hot = []
for basket in transactions:
    row = {item: (item in basket) for item in unique_items}
    one_hot.append(row)
df = pd.DataFrame(one_hot)
print("One-hot encoded transactions:")
print(df)
# Find frequent itemsets
freq_itemsets = apriori(
    df,
    min_support=0.4,  # appears in at least 40% of baskets
    use_colnames=True
)
print("\nFrequent itemsets:")
print(freq_itemsets)
# Generate association rules
rules = association_rules(
    freq_itemsets,
    metric="confidence",
    min_threshold=0.6
)
print("\nAssociation Rules:")
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
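To make the level-wise search behind `apriori` more concrete, here is a simplified sketch in plain Python. It is not mlxtend's implementation: the function name `frequent_itemsets` is hypothetical, and the sketch assumes candidates are built by joining frequent itemsets of the previous size and dropping any candidate that has an infrequent subset.

```python
from itertools import combinations


def frequent_itemsets(baskets, min_support):
    """Return {frozenset: support} for every itemset meeting min_support."""
    baskets = [frozenset(b) for b in baskets]
    n = len(baskets)

    def support(itemset):
        return sum(1 for b in baskets if itemset <= b) / n

    # Level 1: frequent single items
    items = {item for b in baskets for item in b}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    result = dict(current)
    k = 2
    while current:
        prev = list(current)
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count step: keep candidates that meet the support threshold
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        result.update(current)
        k += 1
    return result


# Illustrative baskets (an assumption for this sketch)
baskets = [
    ["bread", "milk"],
    ["bread", "diaper", "beer", "eggs"],
    ["milk", "diaper", "beer", "coke"],
    ["bread", "milk", "diaper", "beer"],
    ["bread", "milk", "diaper", "coke"],
]

found = frequent_itemsets(baskets, min_support=0.4)
for itemset, s in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 2))
```

The prune step is what keeps the search tractable: because {bread, coke} appears in only one basket, no larger itemset containing both items is ever counted.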