Seaborn for Data Visualization
Learn how to use Seaborn to create attractive statistical visualizations in Python, including scatter plots, histograms, boxplots, and heatmaps, through short commented examples.
What is Seaborn?
Seaborn is a high-level data visualization library built on top of Matplotlib. It provides a simple interface to create attractive and informative statistical graphics, especially when working with Pandas DataFrames.
Key Advantages of Seaborn
- Beautiful default styles with minimal code.
- Deep integration with Pandas DataFrames.
- Example datasets for quick experimentation, loaded with sns.load_dataset (fetched from an online repository on first use).
- Easy support for colors, groups, and statistical summaries.
Installation & Setup
Install Seaborn, then import it and set a clean plotting style.
# Install (run in a terminal, not in Python)
pip install seaborn
# Then, in Python:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Set a default style for all subsequent plots
# (sns.set_theme is the current name; sns.set is an older alias)
sns.set_theme(style="whitegrid")
Example 1: Scatter Plot
Scatter plots show the relationship between two numerical variables. Seaborn makes it easy to add color for different groups.
Total Bill vs Tip
import seaborn as sns
import matplotlib.pyplot as plt
# Load an example dataset (downloaded from Seaborn's online data repository on first use, then cached)
tips = sns.load_dataset("tips") # restaurant bills and tips
# Create a scatter plot: total_bill vs tip
sns.scatterplot(
    data=tips,
    x="total_bill",  # column for x-axis
    y="tip",         # column for y-axis
    hue="sex",       # color points by 'sex' column
    style="time",    # marker style by lunch/dinner
    size="size"      # marker size by table size
)
plt.title("Total Bill vs Tip by Sex and Time")
plt.xlabel("Total Bill ($)")
plt.ylabel("Tip ($)")
plt.show()
Use hue, style, and size to encode additional information in a single scatter plot.
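Because sns.load_dataset needs network access the first time it runs, it can help to try the same call on a small hand-built DataFrame. The sketch below uses made-up values (not rows from the tips dataset) and a non-interactive Matplotlib backend so it runs without a display; it only illustrates how hue produces one legend entry per group.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical values invented for illustration
df = pd.DataFrame({
    "total_bill": [10.5, 23.0, 15.2, 31.8, 18.4, 27.1],
    "tip": [1.5, 3.5, 2.0, 5.0, 3.0, 4.0],
    "sex": ["Female", "Male", "Female", "Male", "Male", "Female"],
})

fig, ax = plt.subplots()
sns.scatterplot(data=df, x="total_bill", y="tip", hue="sex", ax=ax)

# Collect the legend text: each distinct value of the hue column appears in it
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
print(legend_labels)
```

Passing ax explicitly keeps the plot attached to a known Axes object, which makes it easier to inspect or customize afterwards.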
Example 2: Distribution Plot
Distribution plots (histograms with optional KDE curves) help you understand how a variable is spread.
Distribution of Total Bill
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
# Plot the distribution of total_bill
sns.histplot(
    data=tips,
    x="total_bill",  # numeric column to plot
    bins=20,         # number of histogram bins (bars)
    kde=True,        # show a smooth density curve
    color="teal"
)
plt.title("Distribution of Total Bill")
plt.xlabel("Total Bill ($)")
plt.ylabel("Count")
plt.show()
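To see what the bins argument controls, the counts behind a histogram can be computed directly. This is a minimal sketch using NumPy's np.histogram on made-up values; it illustrates equal-width binning in general and is not a description of how sns.histplot works internally.

```python
import numpy as np

# Hypothetical bill amounts invented for illustration
values = np.array([3.0, 7.5, 8.0, 12.0, 12.5, 13.0, 18.0, 19.5, 24.0, 30.0])

# Split the range [min, max] into 3 equal-width bins and count values in each
counts, edges = np.histogram(values, bins=3)

print("bin edges:", edges)   # 4 edges define 3 bins
print("counts:", counts)     # one count per bin; together they cover every value
```

More bins give a finer but noisier picture; fewer bins give a smoother but coarser one.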
Example 3: Boxplot
Boxplots summarize the distribution of a numeric variable and highlight potential outliers. They are great for comparing categories.
Total Bill by Day
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
# Boxplot of total_bill for each day of the week
sns.boxplot(
    data=tips,
    x="day",         # categories on x-axis
    y="total_bill",  # numeric variable on y-axis
    hue="sex"        # split boxes by sex
)
plt.title("Total Bill by Day and Sex")
plt.xlabel("Day of Week")
plt.ylabel("Total Bill ($)")
plt.show()
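The numbers a boxplot summarizes can also be computed by hand. The sketch below uses made-up values and the common 1.5 × IQR rule, which matches the default whis=1.5 in sns.boxplot; the exact quartile interpolation used by a plotting library may differ slightly on other data.

```python
import pandas as pd

# Hypothetical bill amounts invented for illustration; 50.0 is deliberately extreme
bills = pd.Series([10.0, 12.0, 13.0, 15.0, 16.0, 18.0, 20.0, 22.0, 50.0])

q1 = bills.quantile(0.25)      # lower edge of the box
median = bills.quantile(0.50)  # line inside the box
q3 = bills.quantile(0.75)      # upper edge of the box
iqr = q3 - q1                  # interquartile range (box height)

# Points beyond 1.5 * IQR from the box are drawn as individual outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = bills[(bills < lower_fence) | (bills > upper_fence)]

print(q1, median, q3, iqr)
print(outliers.tolist())
```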
Example 4: Correlation Heatmap
Heatmaps are useful for visualizing correlations among several numeric variables at once.
Correlation Matrix
import seaborn as sns
import matplotlib.pyplot as plt
tips = sns.load_dataset("tips")
# Compute the correlation matrix, skipping non-numeric columns such as sex and day
corr = tips.corr(numeric_only=True)
# Plot correlation heatmap
sns.heatmap(
    corr,
    annot=True,       # show correlation values
    cmap="coolwarm",  # color map
    fmt=".2f",        # number format
    square=True
)
plt.title("Correlation Heatmap (Tips Dataset)")
plt.show()
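Each cell in the heatmap is a Pearson correlation coefficient, which is what DataFrame.corr computes by default. As a rough check of what those numbers mean, the sketch below computes one coefficient by hand on made-up values and compares it with the pandas result.

```python
import numpy as np
import pandas as pd

# Hypothetical values invented for illustration
df = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 40.0, 50.0],
    "tip": [1.0, 3.0, 4.0, 6.0, 9.0],
})

# Center each column on its mean
x = df["total_bill"] - df["total_bill"].mean()
y = df["tip"] - df["tip"].mean()

# Pearson r: sum of products of deviations, scaled by the deviations' magnitudes
r_manual = (x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum())

# The same coefficient taken from the pandas correlation matrix
r_pandas = df.corr().loc["total_bill", "tip"]

print(round(r_manual, 3), round(r_pandas, 3))
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.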