t-SNE Q&A: 20 Core Questions
Interview Prep


Short questions and answers on t-SNE for nonlinear dimensionality reduction and data visualization.

1 What is t-SNE mainly used for? ⚡ Beginner
Answer: t-SNE is used for visualizing high-dimensional data in 2D or 3D while preserving local neighborhood structure.
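A minimal sketch of what this looks like with scikit-learn's `TSNE`; the dataset and parameter values below are illustrative, not a recommendation:

```python
# Sketch: embedding high-dimensional data into 2-D with t-SNE
# (random data stands in for real features).
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))  # 100 points in 20 dimensions

emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(emb.shape)  # (100, 2)
```

The resulting 2-D coordinates are typically passed straight to a scatter plot.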
2 Is t-SNE a linear or non-linear method? ⚡ Beginner
Answer: t-SNE is a non-linear dimensionality reduction technique.
3 What does t-SNE try to preserve when reducing dimensions? 📊 Intermediate
Answer: It aims to preserve local neighbor relationships by matching pairwise similarity distributions in high and low dimensions.
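The two distributions being matched can be sketched in NumPy: Gaussian similarities P in the original space, Student-t similarities Q in the embedding, and the KL divergence that t-SNE minimizes. This toy version symmetrizes nothing and fixes the Gaussian bandwidth by hand instead of running the usual per-point perplexity search:

```python
# Sketch of t-SNE's objective on toy data: KL(P || Q) between
# high-dimensional Gaussian similarities and low-dimensional
# Student-t similarities (bandwidth fixed at 1.0 for simplicity).
import numpy as np

def pairwise_sq_dists(X):
    diff = X[:, None, :] - X[None, :, :]
    return (diff ** 2).sum(-1)

rng = np.random.default_rng(0)
X_high = rng.normal(size=(20, 10))  # high-dimensional points
Y_low = rng.normal(size=(20, 2))    # a candidate 2-D embedding

# High-dimensional similarities: Gaussian kernel, normalized (P)
D = pairwise_sq_dists(X_high)
P = np.exp(-D / 2.0)
np.fill_diagonal(P, 0.0)
P /= P.sum()

# Low-dimensional similarities: Student-t kernel, normalized (Q)
Dq = pairwise_sq_dists(Y_low)
Q = 1.0 / (1.0 + Dq)
np.fill_diagonal(Q, 0.0)
Q /= Q.sum()

# t-SNE moves the low-dimensional points to minimize this divergence
mask = P > 0
kl = np.sum(P[mask] * np.log(P[mask] / Q[mask]))
print(kl)
```

A random embedding gives a positive KL; gradient descent on Y_low would drive it down.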
4 What is perplexity in t-SNE? 🔥 Advanced
Answer: Perplexity is a parameter roughly related to the effective number of neighbors considered for each point.
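The "effective number of neighbors" reading follows from the definition: perplexity is 2 raised to the Shannon entropy (in bits) of a point's conditional neighbor distribution. A small sketch with hand-picked toy distributions:

```python
# Sketch: perplexity of a neighbor distribution, 2 ** H(p)
# with H the Shannon entropy in bits (toy distributions).
import numpy as np

def perplexity(p):
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum()
    return 2.0 ** entropy

# A point that treats 4 neighbors as equally likely:
p_uniform = np.array([0.25, 0.25, 0.25, 0.25, 0.0, 0.0])
print(perplexity(p_uniform))  # 4.0 -- "effectively 4 neighbors"

# A point concentrated on a single neighbor:
p_peaked = np.array([0.97, 0.01, 0.01, 0.01])
print(perplexity(p_peaked))  # close to 1
```

t-SNE searches, per point, for the Gaussian bandwidth whose neighbor distribution hits the user-specified perplexity.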
5 How does the learning rate affect t-SNE? 🔥 Advanced
Answer: If the learning rate is too low, optimization converges slowly and points tend to compress into a dense cloud; if too high, the embedding becomes unstable, with points forming a roughly equidistant "ball" or diverging.
6 Why is t-SNE primarily an exploratory tool, not a general-purpose feature reducer? 📊 Intermediate
Answer: t-SNE is non-parametric and stochastic, and it is designed for visualization; it provides no mapping for embedding new points, and its distortions are hard to interpret quantitatively.
7 Is the global structure in a t-SNE plot always reliable? 🔥 Advanced
Answer: Not necessarily; t-SNE is designed to preserve local structure, so global distances and cluster sizes can be misleading.
8 Should you run t-SNE on raw features or after a step like PCA? 📊 Intermediate
Answer: Often you first apply PCA to reduce dimensionality (e.g., to 30–50 dims) and then run t-SNE for stability and speed.
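A sketch of that two-stage pipeline; the sizes are illustrative, and with only 20 input features the PCA step to 10 components stands in for the usual reduction to ~30–50 dimensions:

```python
# Sketch: PCA first, then t-SNE on the reduced features.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))

X_pca = PCA(n_components=10, random_state=0).fit_transform(X)
emb = TSNE(n_components=2, perplexity=30,
           random_state=0).fit_transform(X_pca)
print(emb.shape)  # (200, 2)
```

The PCA step denoises the input and makes the pairwise-distance computation cheaper before t-SNE's more expensive optimization.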
9 Is t-SNE deterministic? ⚡ Beginner
Answer: No, results vary with random initialization and parameter settings; fixing the random seed improves reproducibility.
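With scikit-learn, fixing `random_state` to an integer makes repeated runs on the same data reproducible (per the library's documentation); a sketch on illustrative data:

```python
# Sketch: two runs with the same integer seed produce the same embedding.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))

emb_a = TSNE(n_components=2, perplexity=10,
             random_state=42).fit_transform(X)
emb_b = TSNE(n_components=2, perplexity=10,
             random_state=42).fit_transform(X)
print(np.allclose(emb_a, emb_b))  # True
```

Different seeds, by contrast, will generally give visibly different layouts.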
10 Is t-SNE suitable as a preprocessing step for clustering? 🔥 Advanced
Answer: Generally no; t-SNE is optimized for visualization, not for preserving cluster geometry needed by clustering algorithms.
11 What does it mean if t-SNE shows well-separated clusters? 📊 Intermediate
Answer: It often indicates that the classes or groups have distinct local neighborhoods in high-dimensional space, but it’s not a rigorous proof.
12 How does t-SNE differ from PCA? 📊 Intermediate
Answer: PCA is a linear, global variance-based method; t-SNE is non-linear and local-neighborhood based, optimized for visualization.
13 Why can t-SNE be slow on large datasets? 🔥 Advanced
Answer: It must compute and optimize over all pairwise similarities, which is O(n²) in the exact formulation; Barnes–Hut and other approximate variants reduce this and help scale it up.
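scikit-learn exposes both solvers through the `method` parameter; a sketch on a tiny illustrative dataset (Barnes–Hut is the default and the practical choice for large n):

```python
# Sketch: exact O(n^2) solver vs. the Barnes-Hut approximation.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))

emb_exact = TSNE(n_components=2, perplexity=10, method="exact",
                 random_state=0).fit_transform(X)
emb_bh = TSNE(n_components=2, perplexity=10, method="barnes_hut",
              random_state=0).fit_transform(X)
print(emb_exact.shape, emb_bh.shape)  # (50, 2) (50, 2)
```

Note that `method="barnes_hut"` only supports embedding into 2 or 3 dimensions.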
14 Which hyperparameters typically require tuning in t-SNE? 📊 Intermediate
Answer: Mainly perplexity, learning rate, number of iterations and sometimes initialization method.
15 What is a typical perplexity range used in practice? ⚡ Beginner
Answer: Values between 5 and 50 are common; trying a few and comparing plots is recommended.
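Sweeping a few values from that range is straightforward; a sketch (perplexities and data are illustrative, and in practice each embedding would be plotted and compared by eye):

```python
# Sketch: fitting t-SNE at several perplexity values for comparison.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 10))

embeddings = {}
for perp in (5, 15, 30):
    embeddings[perp] = TSNE(
        n_components=2, perplexity=perp, random_state=0
    ).fit_transform(X)
print(sorted(embeddings))  # [5, 15, 30]
```

Perplexity must stay below the number of samples, which bounds the sweep on small datasets.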
16 How can you misuse t-SNE in an analysis? 🔥 Advanced
Answer: Misuse includes over-interpreting distances and cluster sizes, not checking stability across runs, or using it as evidence of separability without other metrics.
17 Is t-SNE appropriate for streaming or online data? 🔥 Advanced
Answer: Not really; it’s batch-oriented and doesn’t provide a simple incremental update rule for new points.
18 Give a real-world use case where t-SNE is very helpful. ⚡ Beginner
Answer: t-SNE is widely used to visualize embeddings like word vectors, image features or latent representations from neural networks.
19 How can you check if a t-SNE result is robust? 📊 Intermediate
Answer: Re-run t-SNE with different random seeds and parameter settings; stable qualitative patterns increase confidence.
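A sketch of that re-running loop; here we only confirm that each run produces a finite 2-D embedding, whereas in practice you would plot the runs side by side and compare the qualitative structure:

```python
# Sketch: re-running t-SNE with different seeds to check stability.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(70, 12))

runs = [
    TSNE(n_components=2, perplexity=10,
         random_state=seed).fit_transform(X)
    for seed in (0, 1, 2)
]
print(all(np.isfinite(r).all() for r in runs))  # True
```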
20 What is the key message to remember about t-SNE? ⚡ Beginner
Answer: t-SNE is a powerful visualization tool, not a general-purpose feature extractor; use it to explore structure, but validate findings with other methods.

Quick Recap: t-SNE

Use t-SNE to visually explore embeddings and clusters, always remembering that it focuses on local neighborhoods and is sensitive to parameter choices.