
Principal Component Analysis (PCA): Interview Q&A

Short questions and answers on PCA: dimensionality reduction, components, explained variance and typical use cases.

Topics: Dimensionality, Eigenvectors, Explained Variance, Projections
1 What problem does PCA solve? ⚡ Beginner
Answer: PCA reduces the dimensionality of data while retaining as much variance (information) as possible.
2 What are principal components? ⚡ Beginner
Answer: Principal components are new orthogonal axes (directions) in feature space along which the data has maximum variance.
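A minimal sketch with scikit-learn (toy random data, not from the original answer) showing that the fitted components are orthonormal directions in feature space:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))            # 200 samples, 5 features (toy data)

pca = PCA(n_components=2).fit(X)
print(pca.components_.shape)             # (2, 5): one direction per component
# The components are orthonormal: pairwise dot products give the identity.
print(np.allclose(pca.components_ @ pca.components_.T, np.eye(2)))
```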
3 Why should features be standardized before applying PCA? 📊 Intermediate
Answer: Without scaling, PCA would be dominated by features with larger numeric ranges, since variance is scale-dependent.
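A hedged illustration, assuming scikit-learn is available (the two synthetic features and their scales are made up for the demo): the large-range feature dominates PC1 unless the data is standardized first.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Feature 1 has a numeric range ~1000x larger than feature 0.
X = np.column_stack([rng.normal(0, 1, 500), rng.normal(0, 1000, 500)])

# Unscaled: PC1 aligns almost entirely with the large-range feature.
print(PCA(n_components=1).fit(X).components_)

# Scaled: both features contribute on an equal footing.
scaled_pca = make_pipeline(StandardScaler(), PCA(n_components=1)).fit(X)
print(scaled_pca.named_steps["pca"].components_)
```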
4 How is PCA related to eigenvalues and eigenvectors? 🔥 Advanced
Answer: PCA computes the eigenvectors and eigenvalues of the covariance matrix; eigenvectors are component directions, eigenvalues measure captured variance.
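One way to check this correspondence numerically (a NumPy/scikit-learn sketch on synthetic correlated data): diagonalize the sample covariance matrix and compare with PCA's fitted attributes. Eigenvectors match the components up to sign.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 4))  # correlated features

# Eigendecomposition of the sample covariance matrix.
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]            # sort descending by variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pca = PCA().fit(X)
print(np.allclose(eigvals, pca.explained_variance_))            # eigenvalues
print(np.allclose(np.abs(eigvecs.T), np.abs(pca.components_)))  # directions
```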
5 What is explained variance ratio in PCA? 📊 Intermediate
Answer: It is the fraction of total variance captured by each principal component, used to decide how many components to keep.
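The ratios can be read straight off a fitted model; this sketch assumes scikit-learn, with the iris dataset used purely as a stand-in:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
pca = PCA().fit(StandardScaler().fit_transform(X))

print(pca.explained_variance_ratio_)        # fraction per component
print(pca.explained_variance_ratio_.sum())  # sums to 1 when all are kept
```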
6 Is PCA supervised or unsupervised? ⚡ Beginner
Answer: PCA is an unsupervised technique; it ignores labels and focuses only on the feature covariance structure.
7 Can PCA improve model performance? 📊 Intermediate
Answer: Sometimes. By reducing noise and multicollinearity it can curb overfitting and help some models, but it may also remove useful information.
8 How do you decide how many principal components to keep? 📊 Intermediate
Answer: Common methods: keep enough components to explain a target proportion of variance (e.g., 95%) or inspect the scree plot for an elbow.
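Both approaches in a short scikit-learn sketch (iris and the 95% threshold are illustrative choices):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

# Method 1: let PCA pick the smallest k that reaches 95% of the variance.
pca_95 = PCA(n_components=0.95).fit(X_scaled)
print(pca_95.n_components_)

# Method 2: inspect the cumulative curve yourself and look for an elbow.
print(np.cumsum(PCA().fit(X_scaled).explained_variance_ratio_))
```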
9 Is PCA good for interpretability of original features? 📊 Intermediate
Answer: Not usually—components are linear combinations of all features, which can be hard to interpret directly.
10 How is PCA useful for visualization? ⚡ Beginner
Answer: PCA can project high-dimensional data down to 2D or 3D, making cluster and structure visualization easier.
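A typical 2D projection sketch, assuming matplotlib and scikit-learn are installed (iris again as a stand-in dataset):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_2d = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))

plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y)    # colour points by class label
plt.xlabel("PC1")
plt.ylabel("PC2")
plt.show()
```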
11 Does PCA assume linear relationships? 🔥 Advanced
Answer: Yes, PCA captures linear correlations; it may miss complex non-linear structure without extensions like kernel PCA.
12 How does PCA relate to the covariance matrix? 🔥 Advanced
Answer: PCA finds directions (eigenvectors) that diagonalize the covariance matrix, concentrating variance along principal axes.
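A quick numerical check of this property (NumPy/scikit-learn sketch with synthetic data): the covariance of the projected data comes out diagonal.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 3))  # correlated features

Z = PCA().fit_transform(X)
# Off-diagonal covariances vanish: components are uncorrelated, and the
# diagonal entries are the eigenvalues (variance along each principal axis).
print(np.round(np.cov(Z, rowvar=False), 6))
```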
13 When might PCA hurt model performance? 📊 Intermediate
Answer: When important predictive information lies in low-variance directions or when interpretability of original features is critical.
14 Should PCA be fit on training data only or on the full dataset? 📊 Intermediate
Answer: Fit PCA on the training data only to avoid information leakage, then apply the learned transform to validation/test data.
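A leakage-safe pattern in scikit-learn (a sketch; the dataset and split are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X_train, X_test = train_test_split(X, random_state=0)

# Fit the scaler and PCA on the training split only...
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=2).fit(scaler.transform(X_train))

# ...then apply the already-learned transforms to held-out data.
X_test_2d = pca.transform(scaler.transform(X_test))
print(X_test_2d.shape)
```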
15 Is PCA sensitive to outliers? 🔥 Advanced
Answer: Yes, outliers can significantly affect the covariance matrix and distort components; robust PCA variants exist.
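A tiny demonstration of that sensitivity (NumPy/scikit-learn sketch; the data and the single outlier are synthetic):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) * [3, 1]   # most variance along feature 0

print(PCA(n_components=1).fit(X).components_)     # aligns with feature 0

# A single extreme point is enough to swing the first component.
X_out = np.vstack([X, [[0, 100]]])
print(PCA(n_components=1).fit(X_out).components_) # now aligns with feature 1
```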
16 Can PCA be used before clustering? 📊 Intermediate
Answer: Yes, PCA is often applied to reduce dimensionality and noise before clustering algorithms like k-means.
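A common pipeline sketch, assuming scikit-learn (two components and three clusters are illustrative choices for iris):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)

# Scale, compress to two components, then cluster in the reduced space.
model = make_pipeline(StandardScaler(),
                      PCA(n_components=2),
                      KMeans(n_clusters=3, n_init=10, random_state=0))
labels = model.fit_predict(X)
print(labels[:10])
```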
17 How does kernel PCA differ from standard PCA? 🔥 Advanced
Answer: Kernel PCA uses the kernel trick to perform PCA in a high-dimensional feature space, capturing non-linear structure.
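The classic concentric-circles illustration, assuming scikit-learn (the rbf kernel and gamma=10 are illustrative choices): linear PCA can only rotate the data, while kernel PCA can unfold the rings.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

# Two concentric circles: no single linear direction separates them.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

linear = PCA(n_components=2).fit_transform(X)      # just rotates the data
kernel = KernelPCA(n_components=2, kernel="rbf",
                   gamma=10).fit_transform(X)      # can unfold the rings
print(linear.shape, kernel.shape)
```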
18 Give a real-world use case where PCA is helpful. ⚡ Beginner
Answer: PCA is used for image compression, noise reduction, exploratory data analysis, and visualization of high-dimensional datasets.
19 Does PCA keep class separability in supervised problems? 🔥 Advanced
Answer: Not necessarily; PCA optimizes for variance, not class separation, so supervised methods like LDA might be better for that goal.
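A side-by-side sketch, assuming scikit-learn (iris as a stand-in): LDA uses the labels, PCA does not.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

X_pca = PCA(n_components=2).fit_transform(X)   # ignores the labels entirely
# LDA instead picks directions that maximize between-class separation.
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)
print(X_pca.shape, X_lda.shape)
```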
20 What is the key message to remember about PCA? ⚡ Beginner
Answer: PCA is a powerful linear tool for compression and visualization; always scale features, avoid leakage, and check that the lost dimensions aren’t crucial for your task.

Quick Recap: PCA

Understand variance, eigenvectors and projections—once you do, PCA becomes an intuitive way to simplify complex datasets before modeling or visualization.