Anomaly Detection
Anomaly detection identifies rare observations that deviate significantly from the majority of the data, such as fraudulent transactions, network intrusions, or faulty sensor readings.
Real-World Use Cases
- Credit card fraud detection.
- Network intrusion detection.
- Industrial equipment fault monitoring.
- Medical anomaly detection (rare diseases, unusual lab results).
Isolation Forest
Isolation Forest isolates anomalies by randomly partitioning the feature space: each tree repeatedly picks a random feature and a random split value. Anomalies need fewer splits to isolate, so their average path length across the trees is shorter.
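from sklearn.ensemble import IsolationForest

# X_train and X_test are assumed to be numeric feature arrays of shape (n_samples, n_features).
iso = IsolationForest(
    n_estimators=200,    # number of random trees in the ensemble
    contamination=0.02,  # expected fraction of anomalies; sets the decision threshold
    random_state=42,     # fixed seed for reproducible results
)
iso.fit(X_train)

scores = iso.decision_function(X_test)  # lower scores are more anomalous; negative = anomaly
labels = iso.predict(X_test)            # -1 = anomaly, 1 = normal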
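The snippet above assumes that X_train and X_test already exist. As a minimal, self-contained sketch, the version below runs the same estimator on synthetic data. The data, the seed, and the names X_demo, iso_demo, demo_scores and demo_labels are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic data (assumption): a Gaussian cluster of "normal" points
# plus a handful of points placed far away from it.
X_normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
X_outliers = rng.uniform(low=6.0, high=10.0, size=(10, 2))
X_demo = np.vstack([X_normal, X_outliers])

iso_demo = IsolationForest(n_estimators=200, contamination=0.02, random_state=42)
iso_demo.fit(X_demo)

# decision_function: the lower the score, the more anomalous the point.
demo_scores = iso_demo.decision_function(X_demo)
demo_labels = iso_demo.predict(X_demo)  # -1 = anomaly, 1 = normal

mean_normal_score = demo_scores[:500].mean()
mean_outlier_score = demo_scores[500:].mean()
n_flagged = int((demo_labels == -1).sum())

print(f"mean score, normal points:  {mean_normal_score:.3f}")
print(f"mean score, distant points: {mean_outlier_score:.3f}")
print(f"points flagged as anomalies: {n_flagged}")
```

The distant points should receive a lower mean score than the cluster, and the number of flagged points should be close to contamination times the number of training samples, because that parameter fixes the threshold.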
One-Class SVM
One-Class SVM learns a decision boundary around the "normal" class and flags points that lie outside this region as anomalies. It is therefore usually trained on data that contains only, or almost only, normal samples.
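from sklearn.svm import OneClassSVM

# X_train_normal should contain only (or almost only) normal samples.
# nu upper-bounds the fraction of training points treated as outliers.
ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
ocsvm.fit(X_train_normal)
pred = ocsvm.predict(X_test)  # -1 = anomaly, 1 = normal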
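An RBF-kernel SVM is sensitive to feature scale, so in practice the model is often wrapped in a pipeline that standardizes the features first. The sketch below illustrates one way to do this on synthetic data. The data and the names X_normal_demo, X_far and ocsvm_pipe are hypothetical, and the scaling step is a suggestion rather than part of the original example.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic "normal" data (assumption) with two features on very different scales.
X_normal_demo = rng.normal(loc=[0.0, 100.0], scale=[1.0, 50.0], size=(400, 2))

# Two hypothetical test points, each far from the training data along one feature.
X_far = np.array([[8.0, 100.0], [0.0, 600.0]])

# Standardize first so that both features contribute to the RBF distance.
ocsvm_pipe = make_pipeline(
    StandardScaler(),
    OneClassSVM(kernel="rbf", gamma="scale", nu=0.05),
)
ocsvm_pipe.fit(X_normal_demo)

train_pred = ocsvm_pipe.predict(X_normal_demo)  # -1 = anomaly, 1 = normal
far_pred = ocsvm_pipe.predict(X_far)

# Roughly nu of the training points end up outside the learned boundary.
train_outlier_fraction = float((train_pred == -1).mean())
print(f"fraction of training points flagged: {train_outlier_fraction:.3f}")
print(f"predictions for the distant points:  {far_pred}")
```

Raising nu tightens the boundary and flags more points, while lowering it makes the detector more permissive.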
Evaluating Anomaly Detectors
Evaluation is tricky because anomalies are rare and labels may be incomplete.
- Use precision-recall curves instead of accuracy for highly imbalanced data; with 2% anomalies, a detector that flags nothing is still 98% accurate.
- Work closely with domain experts to validate flagged anomalies.
- Consider cost-sensitive metrics (false negatives are often more expensive than false positives).
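The first and last points can be sketched in code as follows. The labels, scores, and cost values (COST_FN, COST_FP) are synthetic assumptions chosen for illustration. With a scikit-learn detector, an anomaly score where higher means more anomalous can be obtained by negating decision_function.

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

rng = np.random.default_rng(0)

# Synthetic ground truth (assumption): 1 = anomaly, 0 = normal, 2% anomalies.
y_true = np.zeros(1000, dtype=int)
y_true[:20] = 1

# Synthetic anomaly scores (assumption): higher = more anomalous.
# For a fitted detector this could be, e.g., -iso.decision_function(X_test).
anomaly_score = rng.normal(loc=0.0, scale=1.0, size=1000)
anomaly_score[:20] += 3.0

# Precision-recall curve and its summary, average precision.
precision, recall, thresholds = precision_recall_curve(y_true, anomaly_score)
ap = average_precision_score(y_true, anomaly_score)

# Hypothetical costs: a missed anomaly costs ten times as much as a false alarm.
COST_FN, COST_FP = 10.0, 1.0

def total_cost(threshold):
    """Total cost when every point scoring >= threshold is flagged."""
    flagged = anomaly_score >= threshold
    false_negatives = np.sum((y_true == 1) & ~flagged)
    false_positives = np.sum((y_true == 0) & flagged)
    return COST_FN * false_negatives + COST_FP * false_positives

costs = np.array([total_cost(t) for t in thresholds])
best_threshold = thresholds[np.argmin(costs)]

print(f"average precision: {ap:.3f}")
print(f"lowest-cost threshold: {best_threshold:.3f} (cost {costs.min():.0f})")
```

Average precision should be compared against the anomaly rate (here 0.02), which is what a random scorer would achieve. The cost ratio is a business decision and is best set together with domain experts.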