Linear Algebra Math Q&A
Beginner–Intermediate

Linear Algebra Basics for Data Science – Q&A

Explain vectors, matrices, and eigenvalues in plain language, and connect them to ML concepts like PCA.

1 What is a vector and what is a matrix in the context of Data Science? easy
Answer: A vector is an ordered list of numbers (e.g., features of one data point). A matrix is a 2‑D table of numbers, often representing many data points (rows) and many features (columns). Most tabular datasets are naturally represented as matrices, and many ML operations are expressed as matrix multiplications.
Tags: feature vector, design matrix
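A minimal NumPy sketch of both objects (the values are illustrative):

```python
import numpy as np

x = np.array([5.1, 3.5, 1.4])        # one sample as a feature vector
X = np.array([[5.1, 3.5, 1.4],       # design matrix: rows = samples,
              [4.9, 3.0, 1.4],       # columns = features
              [6.2, 3.4, 5.4]])
print(x.shape)  # (3,)   -> 3 features
print(X.shape)  # (3, 3) -> 3 samples x 3 features
```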
2 What is the dot product and why is it important in ML? medium
Answer: The dot product of vectors \(x\) and \(w\) is \(x \cdot w = \sum_i x_i w_i\). Geometrically, \(x \cdot w = \|x\|\,\|w\|\cos\theta\), so it encodes how aligned two vectors are; in ML it appears in linear models (e.g., \(w^Tx\)), in similarity measures, and as the core computation inside many neural network layers.
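A quick NumPy illustration of both the sum formula and the cosine connection (the values are arbitrary):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -1.0, 2.0])
print(x @ w)  # sum_i x_i * w_i = 4.5

# cosine of the angle between x and w
cos = (x @ w) / (np.linalg.norm(x) * np.linalg.norm(w))
print(cos)
```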
3 Intuitively, what are eigenvalues and eigenvectors and how do they relate to PCA? hard
Answer: For a matrix \(A\), an eigenvector \(v\) is a direction that is only scaled (not rotated) by \(A\), and the scale factor is the corresponding eigenvalue \(\lambda\) (i.e. \(Av = \lambda v\)). In PCA, eigenvectors of the covariance matrix give the principal directions of variance, and eigenvalues tell you how much variance each principal component explains.
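A sketch of PCA via the eigendecomposition of the covariance matrix, using NumPy on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=200)  # add a correlated feature

Xc = X - X.mean(axis=0)                 # center the data
cov = np.cov(Xc, rowvar=False)          # 3x3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: for symmetric matrices
order = np.argsort(eigvals)[::-1]       # sort by explained variance
print(eigvals[order])                   # variance per principal component
scores = Xc @ eigvecs[:, order[:2]]     # project onto the top-2 directions
```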
4 What is matrix multiplication in DS terms? easy
Answer: Matrix multiplication composes linear transformations. In ML, multiplying the feature matrix X by a weight vector w produces one prediction per row (sample) for a linear model.
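For example, with NumPy (the weights are made up):

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])  # 3 samples, 2 features
w = np.array([0.5, -0.25])  # weights of a hypothetical linear model
y_hat = X @ w               # one prediction per sample
print(y_hat)                # [0.  0.5 1. ]
```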
5 Why does shape compatibility matter in matrix operations? easy
Answer: Multiplication A(m×n)·B(n×p) is valid only when the inner dimensions match, and the result is m×p. Shape errors often indicate pipeline bugs in feature or batch handling.
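A small demonstration of a shape mismatch in NumPy:

```python
import numpy as np

A = np.ones((4, 3))   # m x n
B = np.ones((3, 2))   # n x p: inner dimensions match
print((A @ B).shape)  # (4, 2)

C = np.ones((2, 3))
try:
    A @ C             # inner dimensions 3 and 2 do not match
except ValueError as e:
    print(e)          # the typical symptom of a feature/batch pipeline bug
```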
6 What is the matrix transpose and where is it used? easy
Answer: The transpose flips rows and columns. It appears in the normal equations, covariance calculations, and gradient derivations (e.g., XᵀX).
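As a sketch, the normal equations (XᵀX)w = Xᵀy solved with NumPy on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=100)

# Normal equations: (X^T X) w = X^T y
w = np.linalg.solve(X.T @ X, X.T @ y)
print(w)  # close to the true weights [1.0, -2.0, 0.5]
```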
7 What is the rank of a matrix? medium
Answer: Rank is the number of linearly independent rows (equivalently, columns). Low rank indicates redundant features and can cause instability in regression.
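A quick check with NumPy, where one row duplicates another:

```python
import numpy as np

X = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],   # exactly 2x the first row
              [1.0, 1.0, 2.0]])
print(np.linalg.matrix_rank(X))  # 2: one row carries no new information
```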
8 What does the matrix inverse represent? medium
Answer: For invertible A, A⁻¹ undoes the transformation A (A⁻¹A = I). In practice, exact inversion is often avoided for numerical stability; factorization-based solvers are preferred.
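A minimal comparison in NumPy; `solve` uses a factorization internally rather than forming A⁻¹ explicitly:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x_inv = np.linalg.inv(A) @ b        # explicit inverse: avoid in practice
x_solve = np.linalg.solve(A, b)     # factorization-based: preferred
print(np.allclose(x_inv, x_solve))  # True, but solve is faster and stabler
```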
9 Why is the determinant conceptually useful? medium
Answer: The determinant measures how a matrix scales volume and signals whether the matrix is singular: det(A) = 0 implies A is non-invertible and its columns are linearly dependent.
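Two illustrative cases in NumPy:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # second row = 2x the first
print(np.linalg.det(A))     # ~0.0: singular, columns are dependent

B = np.diag([2.0, 3.0])
print(np.linalg.det(B))     # 6.0: B scales areas by a factor of 6
```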
10 What is orthogonality and why do we care? medium
Answer: Orthogonal vectors are perpendicular (their dot product is zero). Orthogonal features/components reduce redundancy and simplify optimization and interpretation.
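A small check in NumPy, plus an orthonormal basis obtained from QR:

```python
import numpy as np

u = np.array([1.0, 0.0, 1.0])
v = np.array([1.0, 0.0, -1.0])
print(u @ v)  # 0.0 -> u and v are orthogonal

X = np.random.default_rng(2).normal(size=(5, 3))
Q, R = np.linalg.qr(X)       # columns of Q are orthonormal
print(np.round(Q.T @ Q, 6))  # identity matrix
```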
11 What is a vector norm? easy
Answer: A norm measures a vector's magnitude. The L2 norm is the usual choice for distance/similarity; L1 and L2 norms also appear in regularization (Lasso and Ridge).
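For example, in NumPy:

```python
import numpy as np

x = np.array([3.0, -4.0])
print(np.linalg.norm(x))         # L2 norm: sqrt(9 + 16) = 5.0
print(np.linalg.norm(x, ord=1))  # L1 norm: |3| + |-4| = 7.0
```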
12 How is cosine similarity connected to the dot product? medium
Answer: Cosine similarity = (x·y)/(||x|| ||y||). It measures similarity of direction independent of magnitude and is widely used in retrieval and embedding tasks.
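A sketch showing the scale invariance (the vectors are made up):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x  # same direction, twice the length
cos = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(cos)   # 1.0: direction matches, magnitude is ignored
```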
13 What is SVD and where is it useful? hard
Answer: Singular Value Decomposition factors A as UΣVᵀ, where U and V have orthonormal columns and Σ holds the singular values. It is useful for dimensionality reduction, denoising, latent factor models, and numerical robustness.
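A minimal rank-k approximation sketch with NumPy (k and the matrix are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(6, 4))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                                     # keep the top-2 singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]  # best rank-k approximation of A
print(np.linalg.norm(A - A_k))            # reconstruction error
```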
14 Why do we standardize features before PCA? medium
Answer: PCA is variance-based, so features with large scales dominate the components. Standardizing to zero mean and unit variance ensures each feature contributes comparably.
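An illustration with one deliberately large-scale feature (synthetic data):

```python
import numpy as np

rng = np.random.default_rng(4)
X = np.column_stack([rng.normal(0, 1, 100),      # small-scale feature
                     rng.normal(0, 1000, 100)])  # large-scale feature

Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize: mean 0, std 1
print(np.diag(np.cov(X, rowvar=False)))    # variance dominated by feature 2
print(np.diag(np.cov(Xs, rowvar=False)))   # ~[1, 1]: comparable inputs to PCA
```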
15 One-line linear algebra summary for DS interviews? easy
Answer: Linear algebra is the language of data representation and transformations—vectors, matrices, and decompositions power model training, optimization, and dimensionality reduction.