Tags: KNN, Classification, Simple & Intuitive, scikit-learn
K-Nearest Neighbors (KNN)
Learn how KNN classifies a new point based on the labels of its nearest neighbors, with a short explanation of the theory and a runnable Python example.
What is KNN?
K-Nearest Neighbors is a lazy learning algorithm: it builds no model at training time, it simply stores the training data and does all of its work when a prediction is requested.
- For a new point, it finds the K closest training points (neighbors).
- For classification, it takes a majority vote of their labels.
- Distance is usually Euclidean distance for numeric features; a short from-scratch sketch of these steps follows this list.
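To make those steps concrete, here is a minimal from-scratch sketch of a single KNN prediction using NumPy. The toy arrays (X_train, y_train, x_new) and k=3 are illustrative assumptions, not part of the scikit-learn example further down.

import numpy as np

# Toy training data: 6 points, 2 numeric features, labels 0 or 1 (illustrative)
X_train = np.array([[1.0, 2.0], [1.5, 1.8], [2.0, 2.2],
                    [6.0, 6.5], [6.5, 6.0], [7.0, 7.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])

x_new = np.array([2.2, 2.0])   # point to classify
k = 3                          # number of neighbors to consult

# 1. Euclidean distance from x_new to every training point
distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))

# 2. Indices of the k closest training points
nearest = np.argsort(distances)[:k]

# 3. Majority vote over the neighbors' labels
votes = y_train[nearest]
predicted_label = np.bincount(votes).argmax()

print("Nearest neighbors:", nearest, "labels:", votes)
print("Predicted class:", predicted_label)   # expected: 0

The scikit-learn example below does the same thing, but with efficient neighbor search, scaling, and evaluation utilities.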
Example: KNeighborsClassifier
KNN on Iris Dataset
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report
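# Load the Iris dataset: 150 samples, 4 numeric features, 3 classes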
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
# Scale features for distance-based algorithms
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
knn = KNeighborsClassifier(
    n_neighbors=5,        # K value: number of neighbors to vote
    metric="minkowski",
    p=2                   # p=2 => Euclidean distance
)
knn.fit(X_train_scaled, y_train)
y_pred = knn.predict(X_test_scaled)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nReport:\n", classification_report(y_test, y_pred, target_names=iris.target_names))