Core Machine Learning Concepts & Terminology
Before diving into algorithms, it’s critical to understand the common language of Machine Learning: datasets, features, labels, loss, overfitting, generalization, and more.
Datasets, Features & Labels
A typical ML dataset can be represented as a matrix \(X\) (rows = samples, columns = features) together with a target vector \(y\) that holds one label per sample.
- Sample / Instance: a single row of data (e.g., one customer, one transaction).
- Feature: an input variable (age, income, number of clicks).
- Label / Target: what we want to predict (price, churn yes/no).
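To make the \(X\) / \(y\) layout concrete, here is a minimal sketch in plain Python. The customer values, the `feature_names` list, and the `column` helper are made up for illustration; real projects usually hold this data in a NumPy array or a pandas DataFrame instead.

```python
# Hypothetical toy dataset: each row of X is one sample (a customer),
# each column is one feature, and y holds one label per row.
feature_names = ["age", "income", "clicks"]

X = [
    [25, 40000, 12],
    [47, 82000, 3],
    [33, 56000, 7],
    [52, 61000, 1],
]
y = [0, 1, 0, 1]  # label: churn (1 = yes, 0 = no)

n_samples = len(X)      # number of rows
n_features = len(X[0])  # number of columns


def column(X, name):
    """Return the values of one feature across all samples."""
    j = feature_names.index(name)
    return [row[j] for row in X]


print("samples:", n_samples, "features:", n_features)
print("age column:", column(X, "age"))
print("first sample:", dict(zip(feature_names, X[0])), "label:", y[0])
```

Each row of `X` lines up with the entry of `y` at the same index, which is why the two must always be shuffled or split together.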
Training, Validation & Test Sets
We split the dataset to estimate how well our model will perform on unseen data:
- Training set: used to fit the model parameters.
- Validation set: used for model selection and hyperparameter tuning.
- Test set: used once at the end to report final performance.
Rule of thumb: never use your test data to make modeling decisions. Doing so leaks information about the test set into the model and produces optimistically biased, unreliable metrics.
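One simple way to produce the three sets is to shuffle the sample indices once and cut them into disjoint blocks. The sketch below assumes a 60/20/20 split and a fixed seed; the function name `train_val_test_split` and its parameters are illustrative, not taken from any library.

```python
import random


def train_val_test_split(n_samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Return three disjoint lists of row indices: train, validation, test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed makes the split reproducible

    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)

    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = train_val_test_split(10)
print("train:", sorted(train_idx))
print("val:  ", sorted(val_idx))
print("test: ", sorted(test_idx))
```

Splitting indices rather than the data itself keeps features and labels aligned: the same index list selects rows from both. A plain random split like this assumes the samples are independent; time series or grouped data usually need a different splitting strategy.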
Overfitting vs Underfitting
Models must balance fit and simplicity:
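- Underfitting: the model is too simple to capture the pattern (high bias); both training and validation error are high.
- Overfitting: the model memorizes noise in the training data (high variance); training error is low but validation error is noticeably higher.
- Just right: both training error and validation error are low, and close to each other.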
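The three regimes can be illustrated with a k-nearest-neighbors regressor, chosen here only because it fits in a few lines of plain Python. The data is synthetic (a noisy sine curve) and the values of `k` are arbitrary assumptions: `k=1` memorizes every training point, while `k` equal to the training-set size always predicts the overall mean.

```python
import math
import random


def knn_predict(x_train, y_train, x, k):
    """Average the targets of the k training points closest to x."""
    order = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x))
    nearest = order[:k]
    return sum(y_train[i] for i in nearest) / k


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


rng = random.Random(0)


def make_data(n):
    """Synthetic data: y = sin(x) plus Gaussian noise (an assumption for the demo)."""
    xs = [rng.uniform(0.0, 6.0) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys


x_train, y_train = make_data(40)
x_val, y_val = make_data(40)

errors = {}
for k in (1, 5, 40):
    train_pred = [knn_predict(x_train, y_train, x, k) for x in x_train]
    val_pred = [knn_predict(x_train, y_train, x, k) for x in x_val]
    errors[k] = (mse(y_train, train_pred), mse(y_val, val_pred))
    print(f"k={k:>2}  train MSE={errors[k][0]:.3f}  val MSE={errors[k][1]:.3f}")
```

With `k=1` the training error is exactly zero because each training point is its own nearest neighbor, yet the validation error is not, which is the signature of overfitting. With `k=40` the model ignores `x` entirely, so both errors stay high. An intermediate `k` typically gives the lowest validation error, though the best value depends on the data.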
Loss Functions & Evaluation Metrics
During training, we minimize a loss function. After training, we report metrics that are easier to interpret.
- Regression: MSE, RMSE, MAE, \(R^2\).
- Classification: Accuracy, Precision, Recall, F1-Score, ROC-AUC.
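Most of the metrics listed above follow directly from their definitions. The sketch below computes them in plain Python on small made-up prediction lists; the function names are illustrative, the classification part assumes binary labels with 1 as the positive class, and ROC-AUC is left out because it needs predicted scores rather than hard labels.

```python
import math


def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE and R^2. Assumes y_true is not constant."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r ** 2 for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    mean_true = sum(y_true) / n
    ss_res = sum(r ** 2 for r in residuals)              # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total variation
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae,
            "r2": 1.0 - ss_res / ss_tot}


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


# Made-up predictions, for illustration only.
reg = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
clf = classification_metrics([1, 1, 1, 0, 0, 0, 1, 0],
                             [1, 0, 1, 0, 1, 0, 0, 0])
print(reg)
print(clf)
```

Returning 0.0 when a denominator is zero is a convention chosen for this sketch; libraries differ in how they handle that edge case.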