
Principal Component Analysis (PCA): Interview Q&A

Short questions and answers on PCA: dimensionality reduction, components, explained variance and typical use cases.

Topics: Dimensionality, Eigenvectors, Explained Variance, Projections
1 What problem does PCA solve? ⚡ Beginner
Answer: PCA reduces the dimensionality of data while retaining as much variance (information) as possible.
2 What are principal components? ⚡ Beginner
Answer: Principal components are new orthogonal axes (directions) in feature space along which the data has maximum variance.
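A minimal sketch with scikit-learn (toy random data, not from the original answer) showing that the fitted components are orthonormal directions in feature space:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))            # 200 samples, 5 features (toy data)

pca = PCA(n_components=2).fit(X)
print(pca.components_.shape)             # (2, 5): one direction per component
# The components are orthonormal: pairwise dot products give the identity.
print(np.allclose(pca.components_ @ pca.components_.T, np.eye(2)))
```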
3 Why should features be standardized before applying PCA? 📊 Intermediate
Answer: Without scaling, PCA would be dominated by features with larger numeric ranges, since variance is scale-dependent.
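A hedged illustration, assuming scikit-learn is available (the two synthetic features and their scales are made up for the demo): the large-range feature dominates PC1 unless the data is standardized first.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Feature 1 has a numeric range ~1000x larger than feature 0.
X = np.column_stack([rng.normal(0, 1, 500), rng.normal(0, 1000, 500)])

# Unscaled: PC1 aligns almost entirely with the large-range feature.
print(PCA(n_components=1).fit(X).components_)

# Scaled: both features contribute on an equal footing.
scaled_pca = make_pipeline(StandardScaler(), PCA(n_components=1)).fit(X)
print(scaled_pca.named_steps["pca"].components_)
```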
4 How is PCA related to eigenvalues and eigenvectors? 🔥 Advanced
Answer: PCA computes the eigenvectors and eigenvalues of the covariance matrix; eigenvectors are component directions, eigenvalues measure captured variance.
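One way to check this correspondence numerically (a NumPy/scikit-learn sketch on synthetic correlated data): diagonalize the sample covariance matrix and compare with PCA's fitted attributes. Eigenvectors match the components up to sign.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 4))  # correlated features

# Eigendecomposition of the sample covariance matrix.
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]            # sort descending by variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pca = PCA().fit(X)
print(np.allclose(eigvals, pca.explained_variance_))            # eigenvalues
print(np.allclose(np.abs(eigvecs.T), np.abs(pca.components_)))  # directions
```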
5 What is explained variance ratio in PCA? 📊 Intermediate
Answer: It is the fraction of total variance captured by each principal component, used to decide how many components to keep.
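The ratios can be read straight off a fitted model; this sketch assumes scikit-learn, with the iris dataset used purely as a stand-in:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
pca = PCA().fit(StandardScaler().fit_transform(X))

print(pca.explained_variance_ratio_)        # fraction per component
print(pca.explained_variance_ratio_.sum())  # sums to 1 when all are kept
```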
6 Is PCA supervised or unsupervised? ⚡ Beginner
Answer: PCA is an unsupervised technique; it ignores labels and focuses only on the feature covariance structure.
7 Can PCA improve model performance? 📊 Intermediate
Answer: Sometimes. By reducing noise and multicollinearity it can curb overfitting and help some models, but it may also remove useful information.
8 How do you decide how many principal components to keep? 📊 Intermediate
Answer: Common methods: keep enough components to explain a target proportion of variance (e.g., 95%) or inspect the scree plot for an elbow.
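Both approaches in a short scikit-learn sketch (iris and the 95% threshold are illustrative choices):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

# Method 1: let PCA pick the smallest k that reaches 95% of the variance.
pca_95 = PCA(n_components=0.95).fit(X_scaled)
print(pca_95.n_components_)

# Method 2: inspect the cumulative curve yourself and look for an elbow.
print(np.cumsum(PCA().fit(X_scaled).explained_variance_ratio_))
```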
9 Is PCA good for interpretability of original features? 📊 Intermediate
Answer: Not usually—components are linear combinations of all features, which can be hard to interpret directly.
10 How is PCA useful for visualization? ⚡ Beginner
Answer: PCA can project high-dimensional data down to 2D or 3D, making cluster and structure visualization easier.
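A typical 2D projection sketch, assuming matplotlib and scikit-learn are installed (iris again as a stand-in dataset):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_2d = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))

plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y)    # colour points by class label
plt.xlabel("PC1")
plt.ylabel("PC2")
plt.show()
```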
11 Does PCA assume linear relationships? 🔥 Advanced
Answer: Yes, PCA captures linear correlations; it may miss complex non-linear structure without extensions like kernel PCA.
12 How does PCA relate to the covariance matrix? 🔥 Advanced
Answer: PCA finds directions (eigenvectors) that diagonalize the covariance matrix, concentrating variance along principal axes.
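A quick numerical check of this property (NumPy/scikit-learn sketch with synthetic data): the covariance of the projected data comes out diagonal.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 3))  # correlated features

Z = PCA().fit_transform(X)
# Off-diagonal covariances vanish: components are uncorrelated, and the
# diagonal entries are the eigenvalues (variance along each principal axis).
print(np.round(np.cov(Z, rowvar=False), 6))
```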
13 When might PCA hurt model performance? 📊 Intermediate
Answer: When important predictive information lies in low-variance directions or when interpretability of original features is critical.
14 Should PCA be fit on training data only or on the full dataset? 📊 Intermediate
Answer: Fit PCA on the training data only to avoid information leakage, then apply the learned transform to validation/test data.
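A leakage-safe pattern in scikit-learn (a sketch; the dataset and split are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X_train, X_test = train_test_split(X, random_state=0)

# Fit the scaler and PCA on the training split only...
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=2).fit(scaler.transform(X_train))

# ...then apply the already-learned transforms to held-out data.
X_test_2d = pca.transform(scaler.transform(X_test))
print(X_test_2d.shape)
```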
15 Is PCA sensitive to outliers? 🔥 Advanced
Answer: Yes, outliers can significantly affect the covariance matrix and distort components; robust PCA variants exist.
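A tiny demonstration of that sensitivity (NumPy/scikit-learn sketch; the data and the single outlier are synthetic):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) * [3, 1]   # most variance along feature 0

print(PCA(n_components=1).fit(X).components_)     # aligns with feature 0

# A single extreme point is enough to swing the first component.
X_out = np.vstack([X, [[0, 100]]])
print(PCA(n_components=1).fit(X_out).components_) # now aligns with feature 1
```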
16 Can PCA be used before clustering? 📊 Intermediate
Answer: Yes, PCA is often applied to reduce dimensionality and noise before clustering algorithms like k-means.
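A common pipeline sketch, assuming scikit-learn (two components and three clusters are illustrative choices for iris):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)

# Scale, compress to two components, then cluster in the reduced space.
model = make_pipeline(StandardScaler(),
                      PCA(n_components=2),
                      KMeans(n_clusters=3, n_init=10, random_state=0))
labels = model.fit_predict(X)
print(labels[:10])
```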
17 How does kernel PCA differ from standard PCA? 🔥 Advanced
Answer: Kernel PCA uses the kernel trick to perform PCA in a high-dimensional feature space, capturing non-linear structure.
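The classic concentric-circles illustration, assuming scikit-learn (the rbf kernel and gamma=10 are illustrative choices): linear PCA can only rotate the data, while kernel PCA can unfold the rings.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

# Two concentric circles: no single linear direction separates them.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

linear = PCA(n_components=2).fit_transform(X)      # just rotates the data
kernel = KernelPCA(n_components=2, kernel="rbf",
                   gamma=10).fit_transform(X)      # can unfold the rings
print(linear.shape, kernel.shape)
```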
18 Give a real-world use case where PCA is helpful. ⚡ Beginner
Answer: PCA is used for image compression, noise reduction, exploratory data analysis, and visualization of high-dimensional datasets.
19 Does PCA keep class separability in supervised problems? 🔥 Advanced
Answer: Not necessarily; PCA optimizes for variance, not class separation, so supervised methods like LDA might be better for that goal.
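A side-by-side sketch, assuming scikit-learn (iris as a stand-in): LDA uses the labels, PCA does not.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

X_pca = PCA(n_components=2).fit_transform(X)   # ignores the labels entirely
# LDA instead picks directions that maximize between-class separation.
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)
print(X_pca.shape, X_lda.shape)
```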
20 What is the key message to remember about PCA? ⚡ Beginner
Answer: PCA is a powerful linear tool for compression and visualization; always scale features, avoid leakage, and check that the lost dimensions aren’t crucial for your task.

Quick Recap: PCA

Understand variance, eigenvectors and projections—once you do, PCA becomes an intuitive way to simplify complex datasets before modeling or visualization.