DBSCAN Clustering
Learn how DBSCAN groups dense regions of points into clusters and marks outliers as noise, which makes it useful for arbitrarily shaped clusters.
What is DBSCAN?
DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise.
- Clusters are areas with high point density.
- Outliers in low-density regions are labeled as noise.
- Parameters:
  - eps: neighborhood radius, the maximum distance at which two points count as neighbors.
  - min_samples: minimum number of points within eps needed to form a dense region.
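To make the two parameters concrete, here is a minimal sketch that counts each point's neighbors within eps using plain NumPy and flags the points that sit in a dense region (usually called core points). The toy array `X_toy` and the helper name `find_core_points` are assumptions made for illustration and are not part of scikit-learn. The sketch follows scikit-learn's convention that a point counts as its own neighbor.

```python
import numpy as np

def find_core_points(X, eps, min_samples):
    """Return a boolean mask marking core points (hypothetical helper)."""
    # Pairwise Euclidean distances between all points, shape (n, n)
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Count neighbors within eps; each point counts itself (distance 0)
    neighbor_counts = (dists <= eps).sum(axis=1)
    return neighbor_counts >= min_samples

# Toy data (assumed for illustration): four tightly packed points
# and one isolated point far away from the rest
X_toy = np.array([
    [0.0, 0.0],
    [0.1, 0.0],
    [0.0, 0.1],
    [0.1, 0.1],
    [5.0, 5.0],
])

core_mask = find_core_points(X_toy, eps=0.2, min_samples=3)
print(core_mask)
```

The four packed points each have enough neighbors to be core points, while the isolated point does not. In full DBSCAN, a non-core point that lies within eps of a core point joins that cluster as a border point, and any remaining point is labeled noise.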
Example: DBSCAN with Noisy Data
DBSCAN in scikit-learn
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons
import matplotlib.pyplot as plt
# Generate non-linearly separable data (two moons)
X, y_true = make_moons(
    n_samples=300,
    noise=0.05,
    random_state=42
)
dbscan = DBSCAN(
    eps=0.2,        # neighborhood radius
    min_samples=5   # minimum points (including the point itself) within eps to form a dense region
)
labels = dbscan.fit_predict(X)
# Points labeled -1 are noise; clusters are numbered from 0
plt.figure(figsize=(8, 6))
scatter = plt.scatter(
    X[:, 0], X[:, 1],
    c=labels,
    cmap="viridis",
    alpha=0.8
)
plt.title("DBSCAN Clustering (Two Moons)")
plt.show()
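The plot shows the result visually, but it can also help to inspect the labels directly. The following sketch, which repeats the same data and parameters so it runs on its own, counts the clusters and noise points. The variable names `n_clusters` and `n_noise` are chosen here for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Same data and parameters as the example above
X, _ = make_moons(n_samples=300, noise=0.05, random_state=42)
labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# Cluster labels are non-negative integers; -1 marks noise
unique_labels = set(labels.tolist())
n_clusters = len(unique_labels - {-1})
n_noise = int(np.sum(labels == -1))

print(f"Clusters found: {n_clusters}")
print(f"Noise points: {n_noise}")

# Size of each cluster, excluding noise
for label in sorted(unique_labels - {-1}):
    print(f"Cluster {label}: {int(np.sum(labels == label))} points")
```

If the counts look wrong (for example, one giant cluster or mostly noise), adjusting eps is usually the first thing to try, since it controls how close points must be to count as neighbors.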