Probability Theory

Probability Interview Q&A for Data Science

Key probability concepts that come up in data science (DS) and machine learning (ML) interviews.

1. What is probability? (easy)
Answer: Probability quantifies uncertainty as a number between 0 (impossible) and 1 (certain).
2. What is a random variable? (easy)
Answer: A random variable maps outcomes of a random process to numeric values.
3. Difference between PMF and PDF? (medium)
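Answer: A PMF (probability mass function) gives P(X=x) for each value of a discrete variable; a PDF (probability density function) describes a continuous variable, where the area under the curve over an interval gives the probability of that interval.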
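To make the mass-versus-density distinction concrete, here is a minimal sketch using only the standard library. The helper names (`binomial_pmf`, `normal_pdf`, `integrate`) and the chosen parameters are illustrative assumptions, not part of the original answer.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a discrete Binomial(n, p) variable."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a continuous Normal(mu, sigma) variable at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the area under f between a and b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# PMF values are probabilities and sum to 1 over all possible outcomes.
pmf_total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))

# PDF values are densities; probability is the area over an interval.
prob_within_one_sigma = integrate(normal_pdf, -1.0, 1.0)

# A density can exceed 1, which a probability never can.
peak_density = normal_pdf(0.0, mu=0.0, sigma=0.1)

print(pmf_total, prob_within_one_sigma, peak_density)
```

The area within one standard deviation of a normal mean comes out near 0.68, while the density at the peak of a narrow normal is well above 1.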
4. What is expectation (mean)? (easy)
Answer: Expected value is the probability-weighted average of a random variable's values, which is also its long-run average outcome over many repetitions.
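As a small illustration, the sketch below computes the expected value of a fair six-sided die exactly and compares it with the average of simulated rolls. The die, the seed, and the number of rolls are assumptions chosen for the example.

```python
from fractions import Fraction
import random

faces = range(1, 7)

# Exact expectation: each face weighted by its probability of 1/6.
expected = sum(Fraction(1, 6) * x for x in faces)

# Long-run average: simulate many rolls with a fixed seed for repeatability.
rng = random.Random(0)
rolls = [rng.randint(1, 6) for _ in range(100_000)]
sample_mean = sum(rolls) / len(rolls)

print(expected, sample_mean)
```

The exact value is 7/2, and the simulated average should land close to 3.5.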
5. What is variance? (easy)
Answer: Variance measures spread around the mean as the expected squared deviation, Var(X)=E[(X−E[X])²]; standard deviation is its square root and is in the same units as X.
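A worked example may help here: the sketch below computes the variance of a fair six-sided die as the expected squared deviation from the mean, then takes the square root. The die is an assumption chosen for illustration.

```python
from fractions import Fraction
import math

faces = range(1, 7)
weight = Fraction(1, 6)  # each face is equally likely

mean = sum(weight * x for x in faces)
# Variance: probability-weighted squared distance from the mean.
variance = sum(weight * (x - mean) ** 2 for x in faces)
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)
```

The variance is exactly 35/12, so the standard deviation is roughly 1.71.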
6. What does Bayes' theorem state? (medium)
Answer: P(A|B)=P(B|A)P(A)/P(B), assuming P(B)>0. In short, Posterior ∝ Likelihood × Prior: it updates beliefs when new evidence arrives.
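The classic interview illustration is a diagnostic test for a rare condition. The sketch below applies P(A|B)=P(B|A)P(A)/P(B) with made-up numbers: the 1% prevalence, 95% sensitivity, and 5% false positive rate are hypothetical, as is the function name `posterior`.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    # Evidence P(positive): true positives plus false positives.
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence

# Hypothetical numbers chosen only for illustration.
p_condition_given_positive = posterior(
    prior=0.01, sensitivity=0.95, false_positive_rate=0.05
)
print(round(p_condition_given_positive, 3))
```

Even with an accurate test, the posterior is only about 16% because the prior is so low, which is the usual point of this question.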
7. What are independent events? (easy)
Answer: Events A and B are independent if the occurrence of one does not change the probability of the other: P(A∩B)=P(A)P(B), or equivalently P(A|B)=P(A) when P(B)>0.
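Independence can be checked by brute force on a small sample space using the product rule P(A∩B)=P(A)P(B). The sketch below enumerates two fair dice; the events and helper names are assumptions picked for the example.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(lambda o: a(o) and b(o)) == prob(a) * prob(b)

first_even = lambda o: o[0] % 2 == 0
sum_is_7 = lambda o: o[0] + o[1] == 7
sum_is_8 = lambda o: o[0] + o[1] == 8

print(independent(first_even, sum_is_7))  # independent pair
print(independent(first_even, sum_is_8))  # dependent pair
```

Knowing the first die is even tells you nothing about whether the sum is 7, but it does shift the chance that the sum is 8.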
8. What is conditional probability? (easy)
Answer: Probability of A given B: P(A|B)=P(A∩B)/P(B), assuming P(B)>0.
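The formula P(A|B)=P(A∩B)/P(B) can be evaluated directly by counting. This sketch uses two fair dice, with A as "the sum is at least 10" and B as "the first die shows 6"; both events and the helper names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two fair dice

def prob(event):
    """Exact probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def conditional(a, b):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) is 0."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return prob(lambda o: a(o) and b(o)) / p_b

sum_at_least_10 = lambda o: sum(o) >= 10
first_is_6 = lambda o: o[0] == 6

p_a = prob(sum_at_least_10)
p_a_given_b = conditional(sum_at_least_10, first_is_6)
print(p_a, p_a_given_b)
```

Conditioning on a first roll of 6 raises the probability from 1/6 to 1/2.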
9. When do we use the normal distribution? (medium)
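Answer: When a quantity arises as the sum or average of many small, roughly independent effects (for example, measurement error), and as an approximation for sample means and sums via the Central Limit Theorem.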
10. What is the CLT? (medium)
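Answer: Central Limit Theorem: the distribution of the sample mean of independent, identically distributed observations with finite variance approaches a normal distribution as the sample size increases, regardless of the shape of the underlying distribution.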
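A quick simulation can show the effect. The sketch below draws from a skewed Exponential(1) distribution and measures how often the sample mean falls within one standard error of the true mean; for a normal distribution that share is about 68%. The distribution, seed, sample sizes, and function name are assumptions for illustration.

```python
import math
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

def share_within_one_se(sample_size, trials=5_000):
    """Share of sample means within one standard error of the true mean.

    Draws come from Exponential(1), which has mean 1 and standard
    deviation 1, so the standard error is 1 / sqrt(sample_size).
    """
    se = 1.0 / math.sqrt(sample_size)
    hits = 0
    for _ in range(trials):
        mean = sum(rng.expovariate(1.0) for _ in range(sample_size)) / sample_size
        if abs(mean - 1.0) < se:
            hits += 1
    return hits / trials

share_n1 = share_within_one_se(1)    # raw draws: far from the normal 68%
share_n50 = share_within_one_se(50)  # means of 50 draws: close to 68%
print(share_n1, share_n50)
```

Single draws give a share near 0.86, while means of 50 draws move toward the normal value, even though the underlying data are strongly skewed.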
11. Binomial vs Poisson? (medium)
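Answer: Binomial models the number of successes in a fixed number n of independent Bernoulli trials with success probability p; Poisson models the count of events in a fixed interval with average rate λ. When n is large and p is small, Binomial(n, p) is well approximated by Poisson with λ=np.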
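The two distributions are closely related: with many trials and a small success probability, a binomial count behaves almost like a Poisson count with λ=np. The sketch below compares the two PMFs for n=1000 and p=0.003, values chosen purely for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(exactly k events) when events occur at average rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 1000, 0.003  # many trials, rare success (illustrative values)
lam = n * p         # matching Poisson rate

# Largest pointwise difference between the two PMFs for k = 0..20.
max_gap = max(
    abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21)
)
print(max_gap)
```

The gap is tiny here; it grows as p gets larger, which is when the fixed number of trials starts to matter.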
12. Why is likelihood different from probability? (hard)
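Answer: Probability treats the parameters as fixed and the data as random; likelihood treats the observed data as fixed and the parameters as variable. The same formula is read two ways, and a likelihood function need not sum or integrate to 1 over the parameters.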
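One way to see the difference is to evaluate the same binomial formula in both directions. In this sketch the data (7 heads in 10 flips) and the grid of candidate parameter values are assumptions made for the example.

```python
import math

def binomial_pmf(k, n, p):
    """P(k heads in n flips) for a coin with heads probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, heads = 10, 7  # hypothetical observed data

# Probability view: p is fixed, the outcome k varies, and the values sum to 1.
prob_total = sum(binomial_pmf(k, n, 0.5) for k in range(n + 1))

# Likelihood view: the data are fixed and p varies over a grid.
grid = [i / 100 for i in range(101)]
likelihood = {p: binomial_pmf(heads, n, p) for p in grid}
best_p = max(grid, key=likelihood.get)  # maximum likelihood estimate on the grid

# Rough area under the likelihood curve; it is not 1.
likelihood_area = sum(likelihood.values()) / 100

print(prob_total, best_p, likelihood_area)
```

The likelihood peaks at p=0.7, the observed share of heads, and its area over p is about 0.09 rather than 1, so it is not a probability distribution over p.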
13. What is prior vs posterior? (medium)
Answer: The prior is the belief about a parameter before seeing data; the posterior is the updated belief after observing data, obtained with Bayes' theorem.
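A common concrete case is estimating a coin's heads probability with a Beta prior, where the update has a closed form: a Beta(a, b) prior combined with h heads and t tails gives a Beta(a+h, b+t) posterior. The choice of a Beta(2, 2) prior and the data below are assumptions for illustration only.

```python
def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Prior belief: Beta(2, 2), centred on a fair coin (illustrative choice).
prior_a, prior_b = 2, 2

# Hypothetical observed data.
heads, tails = 7, 3

# Conjugate update: add observed counts to the prior parameters.
post_a, post_b = prior_a + heads, prior_b + tails

prior_mean = beta_mean(prior_a, prior_b)
posterior_mean = beta_mean(post_a, post_b)
print(prior_mean, posterior_mean)
```

The posterior mean (9/14, about 0.64) sits between the prior mean of 0.5 and the observed frequency of 0.7, and moves toward the data as more flips are observed.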
14. How does probability help in classification? (medium)
Answer: Models estimate class probabilities so decisions can be thresholded based on risk/cost trade-offs.
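As a sketch of cost-based thresholding: if a false positive costs c_fp and a false negative costs c_fn, predicting positive has expected cost (1−p)·c_fp and predicting negative has expected cost p·c_fn, so the positive call is cheaper when p ≥ c_fp/(c_fp+c_fn). The costs, scores, and function names below are hypothetical.

```python
def cost_threshold(cost_fp, cost_fn):
    """Probability above which predicting positive has lower expected cost."""
    return cost_fp / (cost_fp + cost_fn)

def decide(probabilities, threshold):
    """Turn predicted positive-class probabilities into yes/no decisions."""
    return [p >= threshold for p in probabilities]

# Hypothetical model outputs for three cases.
scores = [0.05, 0.20, 0.60]

# Default cut-off treats both error types as equally costly.
default_decisions = decide(scores, 0.5)

# Missing a positive is assumed to be 9 times worse than a false alarm.
threshold = cost_threshold(cost_fp=1.0, cost_fn=9.0)
cost_aware_decisions = decide(scores, threshold)

print(threshold, default_decisions, cost_aware_decisions)
```

With these costs the threshold drops to 0.1, so the 0.20 case is flagged even though it would be rejected at the default 0.5 cut-off.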
15. One-line probability summary for interviews? (easy)
Answer: Probability provides the mathematical framework for uncertainty, inference, and model confidence in Data Science.