Linear Regression: Interview Q&A (20 Core Questions)

Short questions and answers on how linear regression works, its assumptions, and how to diagnose and improve models.

Topics: OLS · Cost Function · R² · Residuals
1 What is linear regression in one sentence? ⚡ Beginner
Answer: Linear regression models the target as a linear combination of input features plus an intercept term.
2 What is the typical cost function used for linear regression? ⚡ Beginner
Answer: The most common cost is the mean squared error (MSE) between predictions and true values.
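As a minimal numpy sketch with made-up values, the MSE is just the average of the squared prediction errors:

```python
import numpy as np

# Mean squared error: average of squared differences between
# predictions and true targets (illustrative values only).
y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 8.0])

mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # (0.25 + 0.0 + 1.0) / 3 ≈ 0.4167
```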
3 How are parameters estimated in ordinary least squares (OLS)? 📊 Intermediate
Answer: OLS finds parameters that minimize the sum of squared residuals, often using a closed‑form solution (XᵀX)⁻¹Xᵀy or gradient‑based optimization.
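The normal-equations route can be sketched in a few lines of numpy; the data here is a synthetic noiseless line, so the fit recovers the true intercept and slope exactly:

```python
import numpy as np

# OLS via the normal equations: beta = (X^T X)^{-1} X^T y.
# Synthetic data where y = 1 + 2*x exactly, so the fit recovers [1, 2].
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

# Design matrix with an intercept column of ones.
X = np.column_stack([np.ones_like(x), x])

# np.linalg.solve is preferred over forming an explicit inverse
# for numerical stability.
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # ≈ [1.0, 2.0]
```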
4 List two core assumptions of linear regression. ⚡ Beginner
Answer: Examples: (1) the relationship between X and y is linear, and (2) the residuals have constant variance (homoscedasticity).
5 What does the intercept term represent? ⚡ Beginner
Answer: The intercept is the predicted value of y when all input features are zero, which may not be meaningful if zero lies outside the observed range of the data.
6 What is multicollinearity and why is it a problem? 📊 Intermediate
Answer: Multicollinearity means features are highly correlated; it can make coefficient estimates unstable and hard to interpret.
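One standard diagnostic is the variance inflation factor (VIF). The sketch below is an illustrative numpy implementation with simulated data, where one feature is nearly a copy of another:

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (no intercept).
    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on the remaining columns (plus an intercept)."""
    n, p = X.shape
    out = []
    for j in range(p):
        yj = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, yj, rcond=None)
        resid = yj - others @ beta
        r2 = 1 - resid @ resid / np.sum((yj - yj.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)   # nearly a copy of x1
x3 = rng.normal(size=200)               # independent feature
X = np.column_stack([x1, x2, x3])
print(vif(X))  # large VIFs for x1 and x2, near 1 for x3
```

A common rule of thumb flags VIF values above 5 or 10 as problematic.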
7 How can you detect non‑linearity in linear regression? 📊 Intermediate
Answer: Plot residuals vs predicted values or vs each feature; systematic curves or patterns indicate that the linear assumption may be violated.
8 What does a coefficient mean in linear regression? ⚡ Beginner
Answer: A coefficient indicates the change in the predicted target for a one‑unit increase in that feature, holding other features constant.
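This "one-unit increase" reading can be demonstrated directly with a hypothetical fitted model:

```python
import numpy as np

# Hypothetical fitted coefficients: intercept 1.0, slope 2.0.
# Raising the feature by one unit changes the prediction by the slope.
beta = np.array([1.0, 2.0])

def predict(x):
    return beta[0] + beta[1] * x

print(predict(5.0) - predict(4.0))  # 2.0, exactly the coefficient on x
```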
9 Why might you standardize features before linear regression? 📊 Intermediate
Answer: Standardization can improve numerical stability, interpretability of regularized models, and convergence of gradient-based solvers.
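Z-score standardization is a one-liner in numpy; the toy matrix below has columns on very different scales:

```python
import numpy as np

# Z-score standardization: subtract each column's mean and divide
# by its standard deviation, giving mean 0 and unit variance.
X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0]])

X_std = (X - X.mean(axis=0)) / X.std(axis=0)
print(X_std.mean(axis=0))  # ≈ [0, 0]
print(X_std.std(axis=0))   # ≈ [1, 1]
```

In practice, the means and standard deviations should be computed on the training set only and reused to transform validation and test data.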
10 What is the difference between simple and multiple linear regression? ⚡ Beginner
Answer: Simple linear regression uses one predictor, while multiple linear regression uses two or more predictors to explain the target.
11 What is the role of residuals in regression analysis? 📊 Intermediate
Answer: Residuals (errors) are the difference between observed and predicted values; analyzing them helps check assumptions and model fit.
12 What is heteroscedasticity? 🔥 Advanced
Answer: Heteroscedasticity occurs when the variance of residuals is not constant across different levels of the predictors, violating a key OLS assumption.
13 How can you handle heteroscedasticity? 🔥 Advanced
Answer: Options include transforming the target (e.g., log), using weighted least squares, or switching to models robust to varying variance.
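Weighted least squares can be sketched with the closed form β = (XᵀWX)⁻¹XᵀWy. In this simulated example the noise variance is assumed known so the weights can be set to its inverse, which is an idealization for illustration:

```python
import numpy as np

# Weighted least squares: beta = (X^T W X)^{-1} X^T W y, with weights
# inversely proportional to each observation's error variance.
rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 100)
sigma = 0.1 * x                      # noise grows with x (heteroscedastic)
y = 1.0 + 2.0 * x + sigma * rng.normal(size=x.size)

X = np.column_stack([np.ones_like(x), x])
W = np.diag(1.0 / sigma**2)          # weight = 1 / variance (assumed known)

beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
print(beta)  # close to the true [1.0, 2.0]
```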
14 What is gradient descent in the context of linear regression? 📊 Intermediate
Answer: Gradient descent iteratively updates the coefficients in the direction that reduces the MSE cost, rather than solving for them in one closed‑form step.
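A minimal batch gradient-descent loop on the MSE cost, using a noiseless toy line so convergence to the true coefficients is easy to see:

```python
import numpy as np

# Batch gradient descent on the MSE cost for linear regression.
# The gradient of mean((X beta - y)^2) w.r.t. beta is (2/n) X^T (X beta - y).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x                      # noiseless line for illustration
X = np.column_stack([np.ones_like(x), x])

beta = np.zeros(2)
lr = 0.05                              # learning rate (chosen by hand)
for _ in range(5000):
    grad = 2.0 / len(y) * X.T @ (X @ beta - y)
    beta -= lr * grad

print(beta)  # converges toward [1.0, 2.0]
```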
15 How do regularized linear models differ from plain OLS? 🔥 Advanced
Answer: Regularized models (e.g., Ridge/Lasso) add a penalty on coefficient size to the loss function, which can reduce overfitting and handle multicollinearity better.
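The Ridge case has a closed form, (XᵀX + αI)⁻¹Xᵀy, which makes the shrinkage effect easy to show on a simulated one-feature problem (centered, with no intercept, to keep the sketch minimal):

```python
import numpy as np

# Ridge regression closed form: beta = (X^T X + alpha * I)^{-1} X^T y.
# The penalty shrinks coefficients toward zero; alpha = 0 recovers OLS.
rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 3.0 * x + 0.1 * rng.normal(size=50)
X = x.reshape(-1, 1)                 # single centered feature, no intercept

def ridge(X, y, alpha):
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

print(ridge(X, y, 0.0))    # ≈ OLS estimate, near the true 3.0
print(ridge(X, y, 100.0))  # heavily shrunk toward 0
```

Lasso has no closed form (its penalty is non-differentiable at zero) and is typically solved by coordinate descent.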
16 Why might R² increase when you add more features, even useless ones? 🔥 Advanced
Answer: Plain R² never decreases when you add features because the model can always fit training data at least as well, even if the new features are noise.
17 What is adjusted R² and why is it preferred sometimes? 📊 Intermediate
Answer: Adjusted R² penalizes adding features that do not improve the model; it can decrease when unnecessary predictors are added, making it more suitable for model comparison.
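Both points from Q16 and Q17 can be checked numerically. The sketch below fits a simulated model with and without a pure-noise predictor; plain R² can only stay the same or rise, while adjusted R² applies the (n − 1)/(n − p − 1) penalty:

```python
import numpy as np

# R^2 = 1 - SS_res / SS_tot;
# adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1),
# where p is the number of predictors (excluding the intercept).
rng = np.random.default_rng(3)
n = 30
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)

def r2_scores(Xcols, y):
    X = np.column_stack([np.ones(len(y))] + Xcols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    p = len(Xcols)
    adj = 1 - (1 - r2) * (len(y) - 1) / (len(y) - p - 1)
    return r2, adj

noise = rng.normal(size=n)            # useless extra predictor
r2_1, adj_1 = r2_scores([x], y)
r2_2, adj_2 = r2_scores([x, noise], y)
print(r2_2 >= r2_1)   # plain R^2 never decreases when a feature is added
```

Adjusted R² is free to fall when the extra predictor adds nothing, which is what makes it more useful for comparing models of different sizes.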
18 How do you check if residuals are approximately normal? 🔥 Advanced
Answer: You can use histograms, Q‑Q plots, or statistical tests (e.g., Shapiro‑Wilk) to visually and numerically assess residual normality.
19 When might linear regression be a bad choice? 📊 Intermediate
Answer: It is a poor choice when the relationship is strongly non‑linear, heavily interaction‑driven, or when assumptions like homoscedasticity are badly violated.
20 Why is linear regression still important to learn? ⚡ Beginner
Answer: Linear regression is simple, interpretable and fast; it builds intuition for more complex models and remains a strong baseline in many practical problems.

Quick Recap: Linear Regression

Understand what the coefficients mean, when assumptions hold, and how to read residual plots—you will then be able to defend or reject linear regression confidently in interviews.