Loss functions MCQ · test your deep learning knowledge
From MSE to cross‑entropy, hinge, Huber and KL divergence – 15 questions covering regression, classification & robust losses.
Loss functions: the compass of neural networks
Loss functions (or cost functions) quantify the difference between predicted and target values. They guide optimisation algorithms to update model parameters. This MCQ covers the most essential loss functions in deep learning, from classic regression losses to modern classification objectives.
Why loss functions matter
The choice of loss function directly shapes what the model learns. For regression, MSE penalises large errors more heavily than small ones; for classification, cross‑entropy measures the dissimilarity between the true and predicted probability distributions. Robust losses such as Huber reduce sensitivity to outliers.
Core concepts tested
MSE & MAE
Mean Squared Error (L2): the mean of (y − ŷ)² – sensitive to outliers. Mean Absolute Error (L1): the mean of |y − ŷ| – more robust to outliers.
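A minimal sketch of both losses in plain Python (the function names `mse` and `mae` and the toy data are illustrative, not from the original):

```python
def mse(y_true, y_pred):
    # Mean Squared Error (L2): average of squared residuals
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    # Mean Absolute Error (L1): average of absolute residuals
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# A single outlier (100 vs 4) inflates MSE far more than MAE
y_true = [1.0, 2.0, 3.0, 100.0]
y_pred = [1.0, 2.0, 3.0, 4.0]
print(mse(y_true, y_pred))  # 2304.0
print(mae(y_true, y_pred))  # 24.0
```

The squaring in MSE is exactly what makes it outlier-sensitive: a residual of 96 contributes 9216 to the sum, versus 96 for MAE.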
Cross‑entropy families
Binary cross‑entropy for binary classification with sigmoid; Categorical cross‑entropy for multi‑class with softmax.
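A sketch of the multi‑class case, also illustrating the categorical vs. sparse categorical distinction asked about below (function names are illustrative; assumes probabilities from a softmax):

```python
import math

def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    # One-hot target: -sum_i t_i * log(p_i); eps guards against log(0)
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))

def sparse_categorical_cross_entropy(class_index, probs, eps=1e-12):
    # Same loss, but the target is an integer class index, not a one-hot vector
    return -math.log(max(probs[class_index], eps))

probs = [0.1, 0.8, 0.1]
print(categorical_cross_entropy([0, 1, 0], probs))   # -log(0.8)
print(sparse_categorical_cross_entropy(1, probs))    # identical value
```

The two variants compute the same quantity; "sparse" only changes how the target is encoded.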
Hinge loss
Used in SVMs and some neural networks for maximum‑margin classification: max(0, 1 − y·f(x)) for labels y ∈ {−1, +1}. A squared‑hinge variant, which penalises margin violations quadratically, is also common.
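A small sketch of the hinge loss and its squared variant (toy scores chosen for illustration):

```python
def hinge(y, score):
    # Hinge loss for labels y in {-1, +1}: max(0, 1 - y * f(x))
    return max(0.0, 1.0 - y * score)

def squared_hinge(y, score):
    # Squared variant: penalises margin violations quadratically
    return max(0.0, 1.0 - y * score) ** 2

print(hinge(1, 2.0))           # 0.0  -> correct and outside the margin
print(hinge(1, 0.5))           # 0.5  -> correct but inside the margin
print(hinge(-1, 0.5))          # 1.5  -> wrong side of the boundary
print(squared_hinge(-1, 0.5))  # 2.25
```

Note the loss is exactly zero once the margin is satisfied, which is what produces sparse support vectors in SVMs.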
Kullback‑Leibler divergence
Measures how one probability distribution diverges from another. It is asymmetric: D_KL(P‖Q) ≠ D_KL(Q‖P) in general. Often used in VAEs and generative models.
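A minimal implementation over discrete distributions (the function name and example distributions are illustrative):

```python
import math

def kl_divergence(p, q, eps=1e-12):
    # D_KL(P || Q) = sum_i p_i * log(p_i / q_i); non-negative, asymmetric
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q))  # ~0.5108
print(kl_divergence(q, p))  # ~0.3681 -> different: KL is not symmetric
print(kl_divergence(p, p))  # 0.0     -> zero iff the distributions match
```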
Huber loss
Combines MSE and MAE: quadratic for small errors, linear for large ones. Less sensitive to outliers.
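The piecewise definition can be sketched as follows (function name and the default delta of 1.0 are illustrative):

```python
def huber(y_true, y_pred, delta=1.0):
    # Quadratic for |error| <= delta (like MSE), linear beyond it (like MAE)
    err = abs(y_true - y_pred)
    if err <= delta:
        return 0.5 * err ** 2
    return delta * (err - 0.5 * delta)

print(huber(3.0, 2.5))   # 0.125 -> small error: quadratic regime
print(huber(3.0, 10.0))  # 6.5   -> large error: grows only linearly
```

The two branches agree in value and slope at |error| = delta, so the loss stays smooth while capping the gradient magnitude at delta for large errors.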
Loss + activation
The output layer activation must match the loss: e.g., sigmoid + binary cross‑entropy, softmax + categorical cross‑entropy.
# Binary cross‑entropy for a single example (Python)
from math import log  # y_pred should be clipped away from 0 and 1 to avoid log(0)
bce = -(y_true * log(y_pred) + (1 - y_true) * log(1 - y_pred))
Common interview questions on loss functions
- Why is MSE not ideal for binary classification with sigmoid?
- Explain the difference between categorical cross‑entropy and sparse categorical cross‑entropy.
- What problem does the Huber loss address?
- What is the derivative of the hinge loss?
- When would you use KL divergence over cross‑entropy?