Maths for ML · Part 5

Eigenvalues and Eigenvectors

In this series (15 parts)
  1. Why Maths Matters for ML: A Practical Overview
  2. Scalars, Vectors, and Vector Spaces
  3. Matrices and Matrix Operations
  4. Matrix Inverses and Systems of Linear Equations
  5. Eigenvalues and Eigenvectors
  6. Matrix Decompositions: LU, QR, SVD
  7. Norms, Distances, and Similarity
  8. Calculus Review: Derivatives and the Chain Rule
  9. Partial Derivatives and Gradients
  10. The Jacobian and Hessian Matrices
  11. Taylor Series and Local Approximations
  12. Probability Fundamentals
  13. Random Variables and Distributions
  14. Bayes' Theorem and Its Role in ML
  15. Information Theory: Entropy, KL Divergence, Cross-Entropy

Most vectors change direction when you multiply them by a matrix. Eigenvalues and eigenvectors are the special cases where a matrix simply stretches or compresses a vector without rotating it. This turns out to be one of the most important ideas in linear algebra, with direct applications to PCA, spectral clustering, stability analysis, and matrix decompositions.

Prerequisites

This article builds on matrix inverses and linear systems. You should be comfortable with matrix multiplication, determinants, and solving systems of equations.

The core definition

A nonzero vector $\mathbf{v}$ is an eigenvector of a square matrix $A$ if multiplying $A$ by $\mathbf{v}$ gives back a scaled version of $\mathbf{v}$:

$$A\mathbf{v} = \lambda \mathbf{v}$$

The scalar $\lambda$ is the corresponding eigenvalue.

Eigenvector equation: the matrix only scales, not rotates:

graph LR
  V["Input vector v"] --> A["Multiply by matrix A"]
  A --> OUT["Output = lambda * v<br/>Same direction as v<br/>Scaled by eigenvalue lambda"]
  style V fill:#e1f5fe
  style OUT fill:#c8e6c9

Read this equation carefully. On the left, a matrix transforms a vector. On the right, that same vector is simply multiplied by a number. The matrix’s effect on this particular vector is nothing more than scaling.

Why this matters

Most vectors get both scaled and rotated by a matrix. Eigenvectors are special because they only get scaled. This makes them natural “axes” for understanding what a matrix does.

  • In PCA, the eigenvectors of the covariance matrix are the principal components, the directions of maximum variance in your data.
  • In gradient descent, the eigenvalues of the Hessian determine how fast you converge along each direction.
  • In graph algorithms, the eigenvectors of the adjacency matrix reveal community structure.
  • In dynamical systems, eigenvalues tell you whether the system is stable (for discrete-time updates, all $|\lambda| < 1$) or unstable.

Finding eigenvalues: the characteristic polynomial

Start from the defining equation and rearrange:

$$A\mathbf{v} = \lambda \mathbf{v}$$

$$A\mathbf{v} - \lambda \mathbf{v} = \mathbf{0}$$

$$(A - \lambda I)\mathbf{v} = \mathbf{0}$$

We need a nonzero $\mathbf{v}$, which means the matrix $(A - \lambda I)$ must be singular. A matrix is singular when its determinant is zero:

$$\det(A - \lambda I) = 0$$

This is the characteristic equation. The left side, $\det(A - \lambda I)$, is a polynomial in $\lambda$ called the characteristic polynomial. For an $n \times n$ matrix, this polynomial has degree $n$, so there are at most $n$ distinct eigenvalues, and exactly $n$ counted with multiplicity over the complex numbers.

For a 2x2 matrix

Given:

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

The characteristic polynomial is:

$$\det\begin{bmatrix} a - \lambda & b \\ c & d - \lambda \end{bmatrix} = (a - \lambda)(d - \lambda) - bc = 0$$

$$\lambda^2 - (a + d)\lambda + (ad - bc) = 0$$

Notice: the coefficient of $\lambda$ is $-(a + d)$, which is the negative of the trace (sum of diagonal entries). The constant term is $ad - bc$, which is the determinant. So:

$$\lambda^2 - \text{tr}(A) \cdot \lambda + \det(A) = 0$$

This is a quadratic you can solve with the quadratic formula.

Characteristic polynomial: roots are the eigenvalues
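The trace/determinant form of the quadratic is short enough to implement directly. A minimal sketch (the helper name `eig_2x2` is our own; in practice you would just call `np.linalg.eig`):

```python
import numpy as np

def eig_2x2(A):
    """Eigenvalues of a 2x2 matrix from lambda^2 - tr(A)*lambda + det(A) = 0."""
    tr = A[0, 0] + A[1, 1]                        # trace
    det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]   # determinant
    disc = tr**2 - 4 * det                        # discriminant of the quadratic
    root = np.sqrt(complex(disc))                 # complex sqrt covers rotation-like matrices
    return (tr + root) / 2, (tr - root) / 2

A = np.array([[4, 1], [2, 3]])
print(eig_2x2(A))  # eigenvalues 5 and 2
```

Taking the complex square root means the same function also reports complex eigenvalue pairs for matrices with no real eigenvectors.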

Finding eigenvectors

Once you know an eigenvalue $\lambda$, find its eigenvector by solving:

$$(A - \lambda I)\mathbf{v} = \mathbf{0}$$

This is a homogeneous system. Since $\det(A - \lambda I) = 0$, there are infinitely many solutions (a whole line or subspace of eigenvectors). You typically pick the simplest nonzero solution.

The set of all eigenvectors for a given eigenvalue, plus the zero vector, is called the eigenspace for that eigenvalue.
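Numerically, one common way to get a basis for an eigenspace is to compute the null space of $A - \lambda I$ via the SVD: the right-singular vector belonging to the smallest singular value spans it. A sketch, using the matrix and eigenvalue from the worked examples that follow:

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])
lam = 5.0  # a known eigenvalue of A

M = A - lam * np.eye(2)
# The eigenspace for lam is the null space of M. The last row of Vt is the
# right-singular vector for the smallest singular value, which spans it here.
_, _, Vt = np.linalg.svd(M)
v = Vt[-1]

print(np.allclose(A @ v, lam * v))  # True
```

The SVD route is numerically robust, but it is one approach among several; textbook row reduction gives the same line.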

Worked example 1: eigenvalues of a 2x2 matrix

Find the eigenvalues of:

$$A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}$$

Step 1: set up the characteristic equation.

$$\det(A - \lambda I) = \det\begin{bmatrix} 4 - \lambda & 1 \\ 2 & 3 - \lambda \end{bmatrix} = 0$$

Step 2: expand the determinant.

$$(4 - \lambda)(3 - \lambda) - (1)(2) = 0$$

$$12 - 4\lambda - 3\lambda + \lambda^2 - 2 = 0$$

$$\lambda^2 - 7\lambda + 10 = 0$$

Step 3: solve the quadratic.

$$(\lambda - 5)(\lambda - 2) = 0$$

$$\lambda_1 = 5, \quad \lambda_2 = 2$$

Quick sanity check: $\lambda_1 + \lambda_2 = 5 + 2 = 7 = \text{tr}(A)$ ✓ and $\lambda_1 \cdot \lambda_2 = 5 \times 2 = 10 = \det(A)$ ✓.

import numpy as np

A = np.array([[4, 1], [2, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)
print("Eigenvalues:", eigenvalues)  # [5. 2.]

Worked example 2: finding eigenvectors

Using the same matrix $A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}$ with eigenvalues $\lambda_1 = 5$ and $\lambda_2 = 2$.

Eigenvector for $\lambda_1 = 5$:

Solve $(A - 5I)\mathbf{v} = \mathbf{0}$:

$$A - 5I = \begin{bmatrix} 4 - 5 & 1 \\ 2 & 3 - 5 \end{bmatrix} = \begin{bmatrix} -1 & 1 \\ 2 & -2 \end{bmatrix}$$

The system:

$$-v_1 + v_2 = 0$$

$$2v_1 - 2v_2 = 0$$

Both equations say the same thing: $v_2 = v_1$. Choose $v_1 = 1$:

$$\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

Eigenvector for $\lambda_2 = 2$:

Solve $(A - 2I)\mathbf{v} = \mathbf{0}$:

$$A - 2I = \begin{bmatrix} 4 - 2 & 1 \\ 2 & 3 - 2 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix}$$

The system:

$$2v_1 + v_2 = 0$$

So $v_2 = -2v_1$. Choose $v_1 = 1$:

$$\mathbf{v}_2 = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$$

import numpy as np

A = np.array([[4, 1], [2, 3]])
vals, vecs = np.linalg.eig(A)
print("Eigenvectors (as columns):")
print(vecs)
# Note: NumPy returns normalized eigenvectors, so they may look
# different in scale but point in the same direction.

Worked example 3: verifying $A\mathbf{v} = \lambda\mathbf{v}$

Let’s verify our results. We found that for $A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}$:

  • $\lambda_1 = 5$ with $\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$
  • $\lambda_2 = 2$ with $\mathbf{v}_2 = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$

Verify $\lambda_1 = 5$, $\mathbf{v}_1 = [1, 1]^T$:

$$A\mathbf{v}_1 = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4(1) + 1(1) \\ 2(1) + 3(1) \end{bmatrix} = \begin{bmatrix} 5 \\ 5 \end{bmatrix}$$

$$\lambda_1 \mathbf{v}_1 = 5 \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 5 \\ 5 \end{bmatrix}$$

$$A\mathbf{v}_1 = \lambda_1 \mathbf{v}_1 \quad \checkmark$$

Verify $\lambda_2 = 2$, $\mathbf{v}_2 = [1, -2]^T$:

$$A\mathbf{v}_2 = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 4(1) + 1(-2) \\ 2(1) + 3(-2) \end{bmatrix} = \begin{bmatrix} 2 \\ -4 \end{bmatrix}$$

$$\lambda_2 \mathbf{v}_2 = 2 \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 2 \\ -4 \end{bmatrix}$$

$$A\mathbf{v}_2 = \lambda_2 \mathbf{v}_2 \quad \checkmark$$

Both check out. The matrix AA scales v1\mathbf{v}_1 by a factor of 5 and v2\mathbf{v}_2 by a factor of 2, without changing their directions.

import numpy as np

A = np.array([[4, 1], [2, 3]])
v1 = np.array([1, 1])
v2 = np.array([1, -2])

print("Av1 =", A @ v1)        # [5 5]
print("5*v1 =", 5 * v1)       # [5 5]
print("Av2 =", A @ v2)        # [ 2 -4]
print("2*v2 =", 2 * v2)       # [ 2 -4]

Geometric meaning

Think of a $2 \times 2$ matrix as a transformation of the plane. It stretches, rotates, and/or reflects every point.

graph LR
  A["Input vector v"] --> B["Multiply by A"]
  B --> C{"Is v an eigenvector?"}
  C -->|Yes| D["Output = λv<br/>(same direction, scaled)"]
  C -->|No| E["Output = Av<br/>(new direction and scale)"]

Eigenvectors are the directions that survive the transformation unchanged (up to scaling). The eigenvalue tells you the scaling factor along that direction.

If both eigenvalues are positive, the matrix stretches in both eigenvector directions. If one is negative, it flips the direction along that eigenvector. If an eigenvalue is zero, the matrix collapses that direction entirely.

A concrete picture

For our matrix $A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}$:

  • Along $\mathbf{v}_1 = [1, 1]^T$: stretch by factor 5.
  • Along $\mathbf{v}_2 = [1, -2]^T$: stretch by factor 2.

Any vector in $\mathbb{R}^2$ can be decomposed into a combination of $\mathbf{v}_1$ and $\mathbf{v}_2$. So the matrix’s action on any vector is determined by these two stretching factors.
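This decomposition is easy to check numerically. A sketch (the test vector `x` is an arbitrary choice of ours):

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])
v1 = np.array([1.0, 1.0])    # eigenvector for lambda = 5
v2 = np.array([1.0, -2.0])   # eigenvector for lambda = 2

x = np.array([3.0, 0.0])     # an arbitrary vector

# Solve x = c1*v1 + c2*v2 for the coordinates of x in the eigenbasis
P = np.column_stack([v1, v2])
c1, c2 = np.linalg.solve(P, x)

# A acts on x by scaling each eigen-coordinate by its eigenvalue
reconstructed = 5 * c1 * v1 + 2 * c2 * v2
print(np.allclose(A @ x, reconstructed))  # True
```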

Diagonalization

If an $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors, we can diagonalize it. Arrange the eigenvectors as columns of a matrix $P$ and the eigenvalues on the diagonal of a matrix $D$:

$$P = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_n \end{bmatrix}, \quad D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$$

Then:

$$A = PDP^{-1}$$

This is called the eigendecomposition of $A$.

Diagonalization process:

graph TD
  A["Start with matrix A"] --> EV["Find eigenvalues<br/>Solve det of A - lambda I = 0"]
  EV --> EC["Find eigenvectors<br/>Solve A - lambda I times v = 0"]
  EC --> P["Form P<br/>Eigenvectors as columns"]
  EC --> D["Form D<br/>Eigenvalues on diagonal"]
  P --> RES["A = P D P-inverse"]
  D --> RES
  style A fill:#e1f5fe
  style RES fill:#c8e6c9

Why diagonalization is useful

  1. Fast matrix powers. $A^k = PD^kP^{-1}$. Since $D$ is diagonal, $D^k$ just raises each diagonal entry to the $k$-th power, so one decomposition replaces $k$ matrix multiplications: roughly $O(n^3)$ instead of $O(n^3 k)$.

  2. Understanding transformations. The decomposition says: “change to the eigenvector basis ($P^{-1}$), scale along each axis ($D$), change back ($P$).”

  3. PCA. The covariance matrix is symmetric, so it is always diagonalizable. Its eigenvectors form an orthogonal basis, and the eigenvalues tell you the variance along each principal component.
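The fast-powers claim from point 1 can be checked directly against NumPy's built-in matrix power (the helper name `matrix_power_eig` is our own):

```python
import numpy as np

A = np.array([[4, 1], [2, 3]])
vals, P = np.linalg.eig(A)

def matrix_power_eig(vals, P, k):
    """A^k = P D^k P^{-1}; D^k is just elementwise powers of the eigenvalues."""
    Dk = np.diag(vals**k)
    return P @ Dk @ np.linalg.inv(P)

# Compare against repeated multiplication
print(np.allclose(matrix_power_eig(vals, P, 10), np.linalg.matrix_power(A, 10)))  # True
```

After the one-time $O(n^3)$ decomposition, each additional power only costs elementwise exponentiation plus two matrix multiplications.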

Diagonalizing our example

For $A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}$ with $\lambda_1 = 5$, $\mathbf{v}_1 = [1, 1]^T$, $\lambda_2 = 2$, $\mathbf{v}_2 = [1, -2]^T$:

$$P = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}, \quad D = \begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix}$$

First, compute $P^{-1}$. Using the $2 \times 2$ inverse formula:

$$\det(P) = (1)(-2) - (1)(1) = -3$$

$$P^{-1} = \frac{1}{-3}\begin{bmatrix} -2 & -1 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} 2/3 & 1/3 \\ 1/3 & -1/3 \end{bmatrix}$$

Verify $PDP^{-1} = A$:

$$PD = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}\begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 5 & 2 \\ 5 & -4 \end{bmatrix}$$

$$PDP^{-1} = \begin{bmatrix} 5 & 2 \\ 5 & -4 \end{bmatrix}\begin{bmatrix} 2/3 & 1/3 \\ 1/3 & -1/3 \end{bmatrix}$$

Entry $(1,1)$: $5(2/3) + 2(1/3) = 10/3 + 2/3 = 12/3 = 4$

Entry $(1,2)$: $5(1/3) + 2(-1/3) = 5/3 - 2/3 = 3/3 = 1$

Entry $(2,1)$: $5(2/3) + (-4)(1/3) = 10/3 - 4/3 = 6/3 = 2$

Entry $(2,2)$: $5(1/3) + (-4)(-1/3) = 5/3 + 4/3 = 9/3 = 3$

$$PDP^{-1} = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix} = A \quad \checkmark$$

import numpy as np

A = np.array([[4, 1], [2, 3]])
vals, P = np.linalg.eig(A)
D = np.diag(vals)
P_inv = np.linalg.inv(P)

# Reconstruct A
A_reconstructed = P @ D @ P_inv
print(np.allclose(A, A_reconstructed))  # True

Special cases and pitfalls

Repeated eigenvalues

A matrix can have repeated eigenvalues. The $2 \times 2$ identity matrix has $\lambda = 1$ with multiplicity 2. It is still diagonalizable (it is already diagonal).

But some matrices with repeated eigenvalues are not diagonalizable:

$$A = \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}$$

This has $\lambda = 3$ with multiplicity 2, but only one linearly independent eigenvector. Such matrices require the Jordan normal form instead of a simple diagonalization.
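You can see the failure numerically: `np.linalg.eig` still returns two eigenvector columns, but they are (up to floating-point noise) parallel, so the matrix built from them is singular and cannot be inverted for $PDP^{-1}$. A sketch:

```python
import numpy as np

J = np.array([[3.0, 1.0], [0.0, 3.0]])
vals, vecs = np.linalg.eig(J)
print(vals)  # repeated eigenvalue 3

# The two eigenvector columns are numerically parallel, so the determinant of
# the would-be P is (essentially) zero: J is not diagonalizable.
print(abs(np.linalg.det(vecs)))  # ~0
```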

Complex eigenvalues

Some real matrices have complex eigenvalues. For example:

$$A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$$

The characteristic equation is $\lambda^2 + 1 = 0$, giving $\lambda = \pm i$. This matrix is a 90-degree rotation, so no real vector stays on its line after transformation. The complex eigenvalues encode the rotation angle.
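NumPy handles this case transparently by returning a complex conjugate pair. A sketch:

```python
import numpy as np

R = np.array([[0.0, -1.0], [1.0, 0.0]])  # 90-degree rotation
vals, _ = np.linalg.eig(R)
print(vals)  # a conjugate pair, +1j and -1j
```

Real matrices always produce complex eigenvalues in conjugate pairs, which is why the imaginary parts here are exactly opposite.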

Symmetric matrices are special

If $A$ is symmetric ($A = A^T$), you get two guarantees:

  1. All eigenvalues are real.
  2. Eigenvectors for distinct eigenvalues are orthogonal, and you can always choose a full orthonormal set of eigenvectors.

This makes symmetric matrices much easier to work with. Covariance matrices are symmetric, which is why PCA works so cleanly.
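NumPy provides `np.linalg.eigh` specifically for symmetric (and Hermitian) matrices; it returns real eigenvalues in ascending order and orthonormal eigenvectors. A sketch with an example symmetric matrix of our own choosing:

```python
import numpy as np

S = np.array([[2.0, 1.0], [1.0, 2.0]])  # a symmetric matrix
vals, Q = np.linalg.eigh(S)             # eigh: specialized for symmetric input

print(vals)                              # [1. 3.] -- all real, ascending
print(np.allclose(Q.T @ Q, np.eye(2)))   # True: columns are orthonormal
```

Prefer `eigh` over the general `eig` whenever you know the matrix is symmetric, e.g. for covariance matrices in PCA.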

Eigenvalues in ML: a quick reference

| Application | What you compute | What eigenvalues tell you |
| --- | --- | --- |
| PCA | Eigenvectors of covariance matrix | Variance along each principal component |
| Spectral clustering | Eigenvectors of graph Laplacian | Cluster structure in the data |
| Gradient descent convergence | Eigenvalues of Hessian | Condition number, convergence speed |
| Stability of recurrent networks | Eigenvalues of weight matrix | Whether gradients explode or vanish |
| PageRank | Dominant eigenvector of link matrix | Importance of each page |

Summary

| Concept | Definition |
| --- | --- |
| Eigenvalue $\lambda$ | Scalar satisfying $A\mathbf{v} = \lambda\mathbf{v}$ |
| Eigenvector $\mathbf{v}$ | Nonzero vector scaled but not rotated by $A$ |
| Characteristic polynomial | $\det(A - \lambda I)$; its roots are the eigenvalues |
| Eigenspace | All eigenvectors for a given $\lambda$, plus $\mathbf{0}$ |
| Diagonalization | $A = PDP^{-1}$ where $D$ is diagonal |
| Trace | Sum of eigenvalues $= a_{11} + a_{22} + \cdots$ |
| Determinant | Product of eigenvalues |

What comes next

Eigenvalues give you one way to decompose a matrix. The next article covers matrix decompositions more broadly, including SVD (singular value decomposition), which generalizes eigendecomposition to non-square matrices and is behind dimensionality reduction, recommendation systems, and low-rank approximations.
