Inferential Statistics: From Sample to Population

Inferential statistics lets you draw conclusions about a population from a sample: you estimate parameters, build confidence intervals, and run hypothesis tests.

Sampling & Confidence Intervals

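A confidence interval (CI) gives a range of plausible values for a population parameter (e.g., the true mean). It is computed from a sample but is a statement about the population: if you repeated the sampling many times, about 95% of the 95% intervals built this way would contain the true parameter.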
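For a population mean, the t-based interval that the code in this section computes can be written as follows. The symbols are notation introduced here for illustration: \(\bar{x}\) is the sample mean, \(s\) the sample standard deviation, \(n\) the sample size, \(\alpha\) is one minus the confidence level, and \(t^{*}\) is the \(1-\alpha/2\) quantile of the t distribution with \(n-1\) degrees of freedom.

\[
\bar{x} \;\pm\; t^{*}_{\,n-1,\;1-\alpha/2} \cdot \frac{s}{\sqrt{n}}
\]

The term \(s/\sqrt{n}\) is the standard error of the mean, so the interval narrows as the sample grows.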

import numpy as np
from scipy import stats

np.random.seed(0)

# Suppose these are sample observations of a metric (e.g. session length)
sample = np.random.normal(loc=5.0, scale=1.0, size=100)

mean = sample.mean()
std_err = stats.sem(sample)   # standard error of the mean
confidence = 0.95

ci_low, ci_high = stats.t.interval(
    confidence,
    df=len(sample) - 1,
    loc=mean,
    scale=std_err
)

print("Sample mean:", round(mean, 3))
print("95% CI    :", (round(ci_low, 3), round(ci_high, 3)))
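The repeated-sampling reading of a confidence interval can be checked with a small simulation. This is a minimal sketch, not part of the original example: it assumes the same normal population as above (mean 5.0, standard deviation 1.0), and the names `true_mean`, `n_trials`, and `coverage` are illustrative. It draws many samples, builds a 95% t-interval from each, and counts how often the interval contains the true mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

true_mean = 5.0     # assumed population mean (known only because we simulate)
n = 100             # observations per sample
n_trials = 2000     # number of repeated samples
confidence = 0.95

# Each row is one independent sample of size n
samples = rng.normal(loc=true_mean, scale=1.0, size=(n_trials, n))

means = samples.mean(axis=1)
std_errs = stats.sem(samples, axis=1)                  # standard error per sample
t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)   # two-sided critical value

ci_low = means - t_crit * std_errs
ci_high = means + t_crit * std_errs

# Fraction of intervals that contain the true mean
covered = (ci_low <= true_mean) & (true_mean <= ci_high)
coverage = covered.mean()

print("Fraction of intervals containing the true mean:", round(coverage, 3))
```

The printed fraction should land close to 0.95. Any single interval either contains the true mean or it does not; the 95% describes the procedure over many samples.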

Hypothesis Testing & p‑values

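In hypothesis testing we start with a null hypothesis \(H_0\) (no effect) and an alternative \(H_1\) (there is an effect). We compute a test statistic and its p‑value, the probability of seeing a statistic at least as extreme as the observed one if \(H_0\) were true, and reject \(H_0\) when the p‑value falls below a chosen significance level \(\alpha\).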

from scipy import stats
import numpy as np

np.random.seed(1)

# Example: one-sample t-test
# H0: true mean = 0, H1: true mean ≠ 0
sample = np.random.normal(loc=0.5, scale=1.0, size=50)

t_stat, p_value = stats.ttest_1samp(sample, popmean=0.0)

print("t statistic:", round(t_stat, 3))
print("p value    :", round(p_value, 4))

alpha = 0.05
if p_value < alpha:
    print("Reject H0 at 5% level")
else:
    print("Fail to reject H0 at 5% level")
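To see what the test computes, the t statistic and two-sided p-value can be reproduced by hand and compared with `stats.ttest_1samp`. This is a sketch under the same setup as the example above (same seed and sample); the names `t_manual` and `p_manual` are illustrative.

```python
import numpy as np
from scipy import stats

np.random.seed(1)
sample = np.random.normal(loc=0.5, scale=1.0, size=50)
popmean = 0.0  # value of the mean under H0

n = sample.size
std_err = sample.std(ddof=1) / np.sqrt(n)   # standard error of the mean

# t statistic: distance of the sample mean from the H0 mean, in standard errors
t_manual = (sample.mean() - popmean) / std_err

# Two-sided p-value: probability of |T| >= |t_manual| under H0,
# where T follows a t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

t_scipy, p_scipy = stats.ttest_1samp(sample, popmean=popmean)

print("manual:", t_manual, p_manual)
print("scipy :", t_scipy, p_scipy)
```

The two lines should agree up to floating-point error, which shows that the p-value is a tail probability of the t distribution evaluated at the observed statistic.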
