
Association Rule Mining

Learn how to discover interesting relationships between items, such as products often bought together, using association rules.

Key Concepts

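  • Support: the fraction of transactions in the dataset that contain an itemset.
  • Confidence: for a rule A ⇒ B, the fraction of transactions containing A that also contain B.
  • Lift: the confidence of A ⇒ B divided by the support of B. A lift above 1 means A and B occur together more often than expected if they were independent; a lift below 1 means less often.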
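
To make the three measures concrete, here is a minimal sketch that computes them by hand in plain Python for the rule diaper ⇒ beer. It uses the same five toy baskets as the mlxtend example; the helper names `support`, `confidence`, and `lift` are illustrative and are not part of any library.

```python
# Five toy baskets (the same ones used in the mlxtend example).
baskets = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "coke"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "coke"},
]


def support(itemset):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Support of A and B together, divided by the support of A."""
    both = set(antecedent) | set(consequent)
    return support(both) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence of A => B, divided by the support of B."""
    return confidence(antecedent, consequent) / support(consequent)


A, B = {"diaper"}, {"beer"}
print("support   :", round(support(A | B), 2))    # 3 of 5 baskets -> 0.6
print("confidence:", round(confidence(A, B), 2))  # 0.6 / 0.8 -> 0.75
print("lift      :", round(lift(A, B), 2))        # 0.75 / 0.6 -> 1.25
```

A lift of 1.25 says that beer shows up about 25% more often in baskets containing diapers than it does across all baskets.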

Example: Apriori with mlxtend

We use a small transaction dataset, convert it to one-hot encoded format, find frequent itemsets with Apriori, then generate association rules.

Frequent Itemsets & Rules
# pip install mlxtend

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

transactions = [
    ["bread", "milk"],
    ["bread", "diaper", "beer", "eggs"],
    ["milk", "diaper", "beer", "coke"],
    ["bread", "milk", "diaper", "beer"],
    ["bread", "milk", "diaper", "coke"],
]

# Convert to a one-hot encoded (boolean) DataFrame:
# one row per basket, one column per item, True if the basket contains the item
unique_items = sorted({item for basket in transactions for item in basket})
one_hot = []
for basket in transactions:
    row = {item: (item in basket) for item in unique_items}
    one_hot.append(row)

df = pd.DataFrame(one_hot)

print("One-hot encoded transactions:")
print(df)

# Find frequent itemsets
freq_itemsets = apriori(
    df,
    min_support=0.4,   # appears in at least 40% of baskets
    use_colnames=True
)
print("\nFrequent itemsets:")
print(freq_itemsets)

# Generate association rules
rules = association_rules(
    freq_itemsets,
    metric="confidence",
    min_threshold=0.6   # keep rules with confidence of at least 60%
)

print("\nAssociation Rules:")
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
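
To see what the two mlxtend calls are doing, the following is a simplified pure-Python sketch of the same pipeline: a level-wise (Apriori-style) search for frequent itemsets, followed by rule generation with a confidence threshold. It is an illustration under the same thresholds (0.4 and 0.6), not mlxtend's actual implementation, and the function names `frequent_itemsets` and `rules_from` are hypothetical.

```python
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "coke"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "coke"},
]


def frequent_itemsets(baskets, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    items = sorted({item for basket in baskets for item in basket})
    level = [frozenset([item]) for item in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while level:
        # Count each candidate and keep those that meet the threshold.
        kept = {}
        for candidate in level:
            sup = sum(candidate <= b for b in baskets) / len(baskets)
            if sup >= min_support:
                kept[candidate] = sup
        frequent.update(kept)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        joined = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [
            c for c in joined
            if all(frozenset(sub) in kept for sub in combinations(c, k - 1))
        ]
    return frequent


def rules_from(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    found = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                consequent = itemset - antecedent
                # Subsets of a frequent itemset are always frequent,
                # so both lookups below are guaranteed to succeed.
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    lift = conf / frequent[consequent]
                    found.append((antecedent, consequent, sup, conf, lift))
    return found


frequent = frequent_itemsets(transactions, min_support=0.4)
rules = rules_from(frequent, min_confidence=0.6)

print(len(frequent))  # 17 frequent itemsets (5 singles, 8 pairs, 4 triples)
for a, c, sup, conf, lift in sorted(rules, key=lambda r: -r[4]):
    print(sorted(a), "=>", sorted(c), round(sup, 2), round(conf, 2), round(lift, 2))
```

The prune step is what makes the search practical: an itemset can only be frequent if all of its subsets are, so most large candidates are discarded without ever being counted. The itemsets and rules printed here should agree with the mlxtend output, although the row order may differ.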