
NumPy Arrays, Vectorization & Broadcasting

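NumPy provides the fast array operations that underpin most numerical computing in Python; libraries such as pandas, SciPy and scikit-learn are built on top of it. Understanding arrays, vectorization and broadcasting lets you replace explicit Python loops with shorter code that typically runs much faster.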

Creating & Inspecting Arrays

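Conceptually, a NumPy array is a block of memory plus three pieces of metadata: the data type (dtype) shared by every element, the shape (the size along each dimension) and the strides (how many bytes to step to reach the next element along each dimension). A freshly created array is stored contiguously; slices and transposes are usually views that reuse the same memory with different strides. Because every element has the same type and a predictable location, NumPy can run vectorized operations in compiled C code instead of slow Python loops.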
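
As a minimal sketch of these three attributes, the snippet below builds a small array and inspects its metadata. The dtype is fixed to int64 here purely as an assumption so that the byte counts are predictable, and the variable names are illustrative.

```python
import numpy as np

# dtype fixed explicitly (an assumption for this sketch) so strides are predictable
a = np.array([[1, 2, 3],
              [4, 5, 6]], dtype=np.int64)

print(a.dtype)     # int64
print(a.shape)     # (2, 3)
print(a.strides)   # (24, 8): 24 bytes to the next row, 8 bytes to the next column

# Transposing returns a view: same memory, swapped shape and strides
t = a.T
print(t.shape)                  # (3, 2)
print(t.strides)                # (8, 24)
print(np.shares_memory(a, t))   # True
print(t.flags["C_CONTIGUOUS"])  # False: the view is not row-major contiguous
```

No data is copied by the transpose; only the shape and strides change, which is why such operations are cheap even on large arrays.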

import numpy as np

# 1D and 2D arrays
v = np.array([1, 2, 3])
M = np.array([[1, 2, 3],
              [4, 5, 6]])

print("v shape:", v.shape)   # (3,)
print("M shape:", M.shape)   # (2, 3)

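# Ranges and random numbers
arr = np.arange(0, 10, 2)             # [0, 2, 4, 6, 8] (stop value is excluded)
rng = np.random.default_rng(seed=0)   # seeded generator for reproducible output
rand = rng.standard_normal((3, 3))    # 3x3 matrix of standard normal samples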

print("arr:", arr)
print("rand mean:", rand.mean())
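
To make the idea of vectorization concrete, here is a small sketch that computes a sum of squares twice: once with an explicit Python loop and once with array operations. The names `values`, `loop_total` and `vec_total` are illustrative; on an array this small the speed difference is negligible, and the gap tends to grow with array size.

```python
import numpy as np

values = np.arange(1, 6)          # [1, 2, 3, 4, 5]

# Explicit Python loop: one interpreted iteration per element
loop_total = 0
for item in values:
    loop_total += int(item) ** 2

# Vectorized: the squaring and the sum both run in compiled code
vec_total = int((values ** 2).sum())

print(loop_total, vec_total)      # 55 55
```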

Slicing, Boolean Indexing & Broadcasting

import numpy as np

x = np.array([10, 20, 30, 40, 50])

# Slicing
print(x[1:4])        # [20 30 40]

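# Boolean indexing: the comparison runs elementwise and returns a boolean array
mask = x >= 30                 # False, False, True, True, True
print("mask:", mask)
print("x[mask]:", x[mask])     # keeps only 30, 40, 50

# Broadcasting: the scalar is added to every element
print("x + 5:", x + 5)         # 15, 25, 35, 45, 55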

M = np.array([[1, 2, 3],
              [4, 5, 6]])
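col_means = M.mean(axis=0)     # one mean per column: [2.5, 3.5, 4.5], shape (3,)

# Subtract column means from each row: shape (2, 3) minus shape (3,)
# broadcasts col_means across both rows
centered = M - col_means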
print("centered:\n", centered)
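
Broadcasting compares shapes starting from the trailing dimension; two dimensions are compatible when they are equal or when one of them is 1. The sketch below reuses the same matrix, with illustrative variable names, to show why centering each row (rather than each column) needs `keepdims=True` or an equivalent reshape.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])

# Row means with shape (2,) cannot broadcast against (2, 3):
# the trailing dimensions are 2 and 3, and neither is 1
row_means_flat = M.mean(axis=1)
try:
    M - row_means_flat
except ValueError as exc:
    print("broadcast error:", exc)

# keepdims=True keeps shape (2, 1), which stretches across the 3 columns
row_means = M.mean(axis=1, keepdims=True)
row_centered = M - row_means
print(row_means.shape)   # (2, 1)
print(row_centered)      # each row becomes -1, 0, 1
```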

Basic Linear Algebra with NumPy

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
b = np.array([5, 6])

# Matrix-vector product
y = A @ b                              # [17, 39]

# Transpose, inverse, eigenvalues
AT = A.T
invA = np.linalg.inv(A)                # approximately [[-2, 1], [1.5, -0.5]]
eig_vals, eig_vecs = np.linalg.eig(A)  # eigenvectors are the columns of eig_vecs

print("A @ b:", y)
print("A^T:\n", AT)
print("A inverse:\n", invA)
print("Eigenvalues:", eig_vals)
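
When the goal is to solve a linear system A x = b, calling `np.linalg.solve` is generally preferred over forming the inverse explicitly. The sketch below uses the same matrix and vector, written as floats, and also checks the eigen-decomposition numerically; the variable `x` is introduced here for illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 6.0])

# Solve A @ x = b directly rather than computing inv(A) @ b
x = np.linalg.solve(A, b)
print("x:", x)                       # approximately [-4.0, 4.5]
print(np.allclose(A @ x, b))         # True

# Each column of eig_vecs is an eigenvector v satisfying A @ v = lambda * v
eig_vals, eig_vecs = np.linalg.eig(A)
print(np.allclose(A @ eig_vecs, eig_vecs * eig_vals))  # True
```

Comparing with `np.allclose` rather than `==` allows for the small rounding errors that floating-point linear algebra introduces.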