NumPy (Numerical Python) is the foundation of high-performance scientific computing and machine learning in Python. It provides the N-dimensional array object (ndarray) and vectorized mathematical operations that run in optimized, compiled C code.
This reference sheet covers array initialization, indexing, broadcasting, linear algebra, and memory optimization.
Before diving into this cheatsheet, check out my previous deep-dive on Pandas Dataframe & Operations Cheatsheet: The Complete Reference to see how we structured these patterns in practice.
Initializing Multi-Dimensional Arrays
NumPy provides fast utilities to construct pre-populated structures without slow Python loop iterations.
import numpy as np
# 1. Base Array Allocations
arr_zeros = np.zeros((3, 4), dtype=np.float32)
arr_ones = np.ones((2, 3), dtype=np.int32)
arr_empty = np.empty((2, 2)) # Allocates memory without initializing it (faster, but contents are arbitrary until written)
# 2. Sequential Generators
arr_range = np.arange(0, 10, 2) # [0, 2, 4, 6, 8]
arr_linear = np.linspace(0, 1, 5) # [0., 0.25, 0.5, 0.75, 1.]
# 3. Random Sample Generators
arr_random_uniform = np.random.rand(3, 3) # Uniformly distributed [0, 1)
arr_random_normal = np.random.randn(1000) # Normal distribution (mean=0, variance=1)
# 4. Core Array Attributes
print(arr_zeros.shape) # (3, 4) - dimensions
print(arr_zeros.ndim) # 2 - number of axes
print(arr_zeros.dtype) # float32 - storage data type
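The random helpers above use NumPy's legacy global random state. As a hedged alternative, the sketch below uses the newer Generator interface (np.random.default_rng), which makes seeding explicit. The seed value 42 and the variable names are arbitrary choices for illustration, not part of the original sheet.

```python
import numpy as np

# Seeded generator: the seed 42 is an arbitrary value chosen for this example
rng = np.random.default_rng(42)

uniform_samples = rng.random((3, 3))        # Uniform on [0, 1)
normal_samples = rng.standard_normal(1000)  # Mean 0, standard deviation 1
int_samples = rng.integers(0, 10, size=5)   # Integers in [0, 10)

# A second generator with the same seed reproduces the same stream
rng_repeat = np.random.default_rng(42)
same_stream = np.array_equal(uniform_samples, rng_repeat.random((3, 3)))
print(same_stream)
```

Passing a generator object around, rather than relying on global state, tends to make experiments easier to reproduce.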
Indexing & Slice Extractions
Basic slicing returns a view rather than a copy, so modifying a slice changes the original array. Boolean-mask and fancy indexing, by contrast, return copies. Call .copy() explicitly when you need an independent array.
# Create a 2D array of shape (4, 4)
matrix = np.arange(16).reshape((4, 4))
"""
[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]]
"""
# 1. Sub-matrix extraction (basic slicing, returns a view)
# Extract rows 0 to 2 and columns 1 to 3 (the end index is exclusive)
slice_view = matrix[0:3, 1:4]
# 2. Boolean Mask Indexing
mask = matrix > 10
filtered_elements = matrix[mask] # Returns a 1D copy of elements exceeding 10
# 3. Fancy Indexing (Extracting non-contiguous elements)
# Extract specific coordinates: (0, 1) and (2, 3)
fancy_result = matrix[[0, 2], [1, 3]] # Returns [1, 11]
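The view-versus-copy distinction can be checked directly. The following is a minimal sketch that rebuilds the same 4x4 matrix and uses np.shares_memory to show which results alias the original; the names slice_copy and fancy_copy are introduced here for illustration.

```python
import numpy as np

matrix = np.arange(16).reshape((4, 4))

# Basic slicing: a view that shares memory with the original
slice_view = matrix[0:3, 1:4]
slice_view[0, 0] = -1                        # Writes through to matrix[0, 1]
print(matrix[0, 1])                          # -1
print(np.shares_memory(matrix, slice_view))  # True

# Explicit copy: independent of the original
slice_copy = matrix[0:3, 1:4].copy()
slice_copy[0, 0] = 99
print(matrix[0, 1])                          # still -1

# Fancy indexing: the result is a copy
fancy_copy = matrix[[0, 2], [1, 3]]
print(np.shares_memory(matrix, fancy_copy))  # False
```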
Vectorized Math & Broadcasting Rules
Broadcasting allows arithmetic operations on arrays of differing shapes. NumPy evaluates dimension sizes from right to left (trailing dimensions). Two dimensions are compatible if they are equal or if one of them is 1.
# Array a: shape (3, 1)
# Array b: shape (4,), treated as (1, 4) during broadcasting
a = np.array([[10], [20], [30]])
b = np.array([1, 2, 3, 4])
# During addition, both arrays broadcast to shape (3, 4)
broadcasted_sum = a + b
"""
[[11, 12, 13, 14],
[21, 22, 23, 24],
[31, 32, 33, 34]]
"""
# Standard element-wise calculations (extremely fast vectorization)
log_array = np.log(a)
exp_array = np.exp(b)
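To make the compatibility rule concrete, here is a small sketch that checks shapes without allocating data and shows what an incompatible pair looks like. It assumes NumPy 1.20 or later, where np.broadcast_shapes is available; the variable names are illustrative.

```python
import numpy as np

# Compatible: (3, 1) and (4,) broadcast to (3, 4)
print(np.broadcast_shapes((3, 1), (4,)))       # (3, 4)

# Compatible: shapes are aligned from the right, size-1 axes stretch
print(np.broadcast_shapes((2, 3, 1), (3, 5)))  # (2, 3, 5)

# Incompatible: trailing dimensions 3 and 4 differ and neither is 1
broadcast_error = None
try:
    np.ones((2, 3)) + np.ones((4,))
except ValueError as err:
    broadcast_error = str(err)
    print("broadcast failed:", broadcast_error)

# A 1D array can be turned into a column with np.newaxis
row = np.array([1, 2, 3, 4])    # shape (4,)
column = row[:, np.newaxis]     # shape (4, 1)
outer_sum = column + row        # shape (4, 4)
print(outer_sum.shape)          # (4, 4)
```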
Fast Linear Algebra
The np.linalg sub-module offers highly optimized routines for solving linear equations, calculating determinants, and decomposing matrices.
x = np.array([[1, 2], [3, 4]], dtype=np.float32)
y = np.array([[5, 6], [7, 8]], dtype=np.float32)
# 1. Matrix Multiplication
matrix_product = np.matmul(x, y) # Equivalent to the @ operator: x @ y
# 2. Transpose Matrix
transposed_x = x.T
# 3. Matrix Inverse & Determinant
inverse_x = np.linalg.inv(x)
det_x = np.linalg.det(x)
# 4. Solve System of Linear Equations: Ax = B
# 1x + 2y = 5
# 3x + 4y = 11
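# Descriptive names avoid overwriting the a and b arrays from the broadcasting section
coefficients = np.array([[1, 2], [3, 4]])
constants = np.array([5, 11])
solution = np.linalg.solve(coefficients, constants) # Returns [1., 2.]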
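A solution is easy to sanity-check by substituting it back into the system. The sketch below does that for the same 2x2 system, compares against the explicit-inverse route, and shows the error raised for a singular matrix. The names coefficients, constants, and singular are illustrative assumptions.

```python
import numpy as np

coefficients = np.array([[1.0, 2.0], [3.0, 4.0]])
constants = np.array([5.0, 11.0])

solution = np.linalg.solve(coefficients, constants)

# Substitute back: coefficients @ solution should reproduce constants
residual_ok = np.allclose(coefficients @ solution, constants)
print(residual_ok)

# The explicit-inverse route agrees here, but solve() avoids forming the inverse
via_inverse = np.linalg.inv(coefficients) @ constants
print(np.allclose(solution, via_inverse))

# A singular matrix (second row is twice the first) cannot be solved
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
singular_raised = False
try:
    np.linalg.solve(singular, constants)
except np.linalg.LinAlgError as err:
    singular_raised = True
    print("cannot solve:", err)
```

Calling solve() directly is generally preferred over multiplying by inv() when only the solution is needed, since it skips an intermediate matrix.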
Memory Layouts & Performance Optimization
NumPy arrays are stored in flat memory blocks that are contiguous by default. Row-major (C order) stores each row contiguously, while column-major (Fortran order) stores each column contiguously. Performance can change noticeably depending on whether the access pattern matches the layout.
# 1. Understanding Contiguity
# Default allocation is C-contiguous (Row-Major)
c_array = np.ones((5000, 5000), order='C')
# F-contiguous allocation (Column-Major)
f_array = np.ones((5000, 5000), order='F')
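# Reading along the contiguous axis touches memory sequentially, which is cache-friendly.
# The size of the gap depends on array size, NumPy version, and hardware, so measure it.
# (%timeit is an IPython/Jupyter magic, not plain Python.)
%timeit c_array.sum(axis=1) # Sums each row: elements of a row are adjacent in memory
%timeit c_array.sum(axis=0) # Sums each column: elements of a column are 5000 items apart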
# 2. Convert to a C-contiguous layout (copies only if the input is not already C-contiguous)
optimized_contiguous = np.ascontiguousarray(f_array)
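Layout can be inspected directly through an array's flags and strides. The sketch below uses a small 3x4 array (an assumption made to keep the output readable, rather than the 5000x5000 arrays above) and shows how the two orders differ and when ascontiguousarray copies.

```python
import numpy as np

# Small shape chosen for readability; names are illustrative
c_small = np.ones((3, 4), dtype=np.float64, order='C')
f_small = np.ones((3, 4), dtype=np.float64, order='F')

print(c_small.flags['C_CONTIGUOUS'], c_small.flags['F_CONTIGUOUS'])  # True False
print(f_small.flags['C_CONTIGUOUS'], f_small.flags['F_CONTIGUOUS'])  # False True

# Strides: bytes to step along each axis (float64 is 8 bytes per element)
print(c_small.strides)  # (32, 8): the next row is 4 elements away
print(f_small.strides)  # (8, 24): the next row is 1 element away

# Transposing swaps the strides without copying, so the view is F-contiguous
transposed = c_small.T
print(transposed.flags['F_CONTIGUOUS'])       # True
print(np.shares_memory(c_small, transposed))  # True

# ascontiguousarray copies here because f_small is not C-contiguous
converted = np.ascontiguousarray(f_small)
print(converted.flags['C_CONTIGUOUS'])        # True
print(np.shares_memory(f_small, converted))   # False
```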