Feb 21, 2026 · 20 min read · Maths for ML

Matrix Inverses and Systems of Linear Equations

Solving equations is the bread and butter of applied mathematics, and ML is no exception. Linear regression reduces to a linear system: the normal equations for least squares. Training a neural network relies on repeated local linear approximations of a nonlinear problem. This article teaches you how to solve $A\mathbf{x} = \mathbf{b}$ and how to tell when a solution exists.

Prerequisites

You should be comfortable with vectors and matrix multiplication before reading this article.

Systems of linear equations

A system of linear equations looks like this:

2x + 3y = 8

x - y = 1

Two equations, two unknowns. You are looking for values of $x$ and $y$ that satisfy both equations simultaneously.

We can write any such system in matrix form:

A\mathbf{x} = \mathbf{b}

where:

A = \begin{bmatrix} 2 & 3 \\ 1 & -1 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 8 \\ 1 \end{bmatrix}

This compact notation works for any size system. A system with 100 equations and 100 unknowns still looks like $A\mathbf{x} = \mathbf{b}$ , just with bigger matrices.
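As a quick sanity check, the sketch below stores the example system in NumPy arrays and confirms that the matrix product reproduces both left-hand sides. The candidate point $(2.2, 1.2)$ is the solution derived in worked example 1; the variable names are illustrative.

```python
import numpy as np

# Coefficients of 2x + 3y = 8 and x - y = 1, one row per equation
A = np.array([[2.0, 3.0],
              [1.0, -1.0]])
b = np.array([8.0, 1.0])

# Candidate solution (x, y), taken from worked example 1
x = np.array([2.2, 1.2])

# Row i of A dotted with x gives the left-hand side of equation i
lhs = A @ x
print(np.allclose(lhs, b))  # True
```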

Three possible outcomes

A linear system has exactly one of three outcomes:

Exactly one solution. The typical case when $A$ is square and invertible.
No solution. The equations contradict each other (parallel lines in 2D).
Infinitely many solutions. The equations are redundant (same line in 2D).

Three cases when solving a 2-equation, 2-variable system:

graph TD
  subgraph "Unique Solution"
      A1["Two lines cross<br/>at one point"] --> R1["Exactly one solution<br/>det A != 0"]
  end
  subgraph "No Solution"
      A2["Two lines are parallel<br/>never intersect"] --> R2["No solution exists<br/>Contradictory equations"]
  end
  subgraph "Infinite Solutions"
      A3["Two lines are the same<br/>overlap completely"] --> R3["Infinitely many solutions<br/>Redundant equations"]
  end
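One way to tell the three cases apart numerically is to compare the rank of $A$ with the rank of the augmented matrix $[A \mid \mathbf{b}]$ (the Rouché–Capelli theorem). The sketch below is a minimal illustration under that assumption; `classify_system` is a hypothetical helper name, and the rank computation relies on NumPy's default floating-point tolerance.

```python
import numpy as np

def classify_system(A, b):
    """Classify Ax = b by comparing rank(A) with rank([A | b])."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    rank_A = np.linalg.matrix_rank(A)
    rank_aug = np.linalg.matrix_rank(np.hstack([A, b]))
    n_unknowns = A.shape[1]
    if rank_A < rank_aug:
        return "no solution"            # b adds an inconsistent equation
    if rank_A == n_unknowns:
        return "unique solution"        # every unknown is pinned down
    return "infinitely many solutions"  # consistent but redundant

print(classify_system([[2, 3], [1, -1]], [8, 1]))  # unique solution
print(classify_system([[2, 4], [1, 2]], [8, 1]))   # no solution
print(classify_system([[2, 4], [1, 2]], [8, 4]))   # infinitely many solutions
```

The second and third calls share the same $A$; only $\mathbf{b}$ decides whether the two parallel lines coincide.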

The matrix inverse

If $A$ is a square $n \times n$ matrix, its inverse $A^{-1}$ (if it exists) is the unique matrix such that:

A A^{-1} = A^{-1} A = I

where $I$ is the identity matrix.

If $A^{-1}$ exists, solving $A\mathbf{x} = \mathbf{b}$ is straightforward:

\mathbf{x} = A^{-1}\mathbf{b}

Multiply both sides by $A^{-1}$ on the left, and the $A$ cancels out.
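Written out step by step, the cancellation uses only the definition of the inverse and the fact that $I\mathbf{x} = \mathbf{x}$:

```latex
\begin{aligned}
A\mathbf{x} &= \mathbf{b} \\
A^{-1}A\mathbf{x} &= A^{-1}\mathbf{b} \\
I\mathbf{x} &= A^{-1}\mathbf{b} \\
\mathbf{x} &= A^{-1}\mathbf{b}
\end{aligned}
```

The side matters: matrix multiplication is not commutative, so $A^{-1}$ has to multiply from the left on both sides of the equation.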

When does the inverse exist?

A square $n \times n$ matrix $A$ is invertible (also called nonsingular) if and only if any one of the following holds:

Its determinant is nonzero: $\det(A) \neq 0$ .
Its rows (or columns) are linearly independent.
The system $A\mathbf{x} = \mathbf{0}$ has only the trivial solution $\mathbf{x} = \mathbf{0}$ .
It has $n$ pivots during Gaussian elimination.

These conditions are all equivalent. If any one holds, all hold. If any one fails, $A$ is singular (not invertible).
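The equivalence can be observed numerically. The following is a rough sketch, not a robust test: `invertibility_report` and the tolerance `tol` are illustrative choices, and thresholding a floating-point determinant is unreliable for badly scaled matrices.

```python
import numpy as np

def invertibility_report(A, tol=1e-12):
    """Numerically evaluate three of the equivalent invertibility conditions."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    singular_values = np.linalg.svd(A, compute_uv=False)
    return {
        "det_nonzero": bool(abs(np.linalg.det(A)) > tol),
        "full_rank": bool(np.linalg.matrix_rank(A) == n),
        # Ax = 0 has only x = 0 exactly when no singular value is zero
        "trivial_null_space": bool(singular_values.min() > tol),
    }

print(invertibility_report([[2, 3], [1, -1]]))  # all three entries True
print(invertibility_report([[2, 4], [1, 2]]))   # all three entries False
```

For each matrix the three flags agree, as the equivalence predicts.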

Is the matrix invertible?

graph TD
  Q["Is matrix A invertible?"] --> D{"det A != 0?"}
  D -->|"Yes"| R{"Rows linearly<br/>independent?"}
  R -->|"Yes"| F{"Full rank?"}
  F -->|"Yes"| INV["A is invertible<br/>Unique solution to Ax = b"]
  D -->|"No"| SING["A is singular<br/>No inverse exists"]
  R -->|"No"| SING
  F -->|"No"| SING
  style INV fill:#c8e6c9
  style SING fill:#ffcdd2

The 2x2 inverse formula

For a $2 \times 2$ matrix, there is a clean formula:

A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \Rightarrow \quad A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}

The quantity $ad - bc$ is the determinant of $A$ , written $\det(A)$ . If $\det(A) = 0$ , the matrix has no inverse.

This formula is quick and useful for hand calculations. For larger matrices, you need Gaussian elimination.
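The formula translates almost directly into code. Here is a minimal sketch, where `inverse_2x2` is a hypothetical helper written for illustration and compared against NumPy's own `np.linalg.inv`:

```python
import numpy as np

def inverse_2x2(M):
    """Invert a 2x2 matrix with the ad - bc formula."""
    (a, b), (c, d) = np.asarray(M, dtype=float)
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular (ad - bc = 0)")
    # Swap the diagonal, negate the off-diagonal, divide by the determinant
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[2, 3], [1, -1]])
A_inv = inverse_2x2(A)
print(np.allclose(A_inv, np.linalg.inv(A)))  # True
print(np.allclose(A @ A_inv, np.eye(2)))     # True
```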

Gaussian elimination

Gaussian elimination is the standard algorithm for solving linear systems. The idea is to use elementary row operations to transform the system into a simpler form that is easy to solve.

Elementary row operations

Three operations that do not change the solution:

Swap two rows.
Scale a row by a nonzero constant.
Add a multiple of one row to another row.
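To see that these operations leave the solution untouched, the sketch below applies each one to the augmented matrix of the running example ($2x + 3y = 8$, $x - y = 1$) and solves again. The helper `solve_augmented` is a name introduced here for illustration.

```python
import numpy as np

# Augmented matrix [A | b] for 2x + 3y = 8, x - y = 1
aug = np.array([[2.0, 3.0, 8.0],
                [1.0, -1.0, 1.0]])

def solve_augmented(M):
    """Solve the system stored as [A | b] with NumPy's direct solver."""
    return np.linalg.solve(M[:, :-1], M[:, -1])

original = solve_augmented(aug)

swapped = aug[[1, 0], :]          # 1. swap the two rows
scaled = aug.copy()
scaled[0] *= 0.5                  # 2. scale row 1 by 1/2
combined = aug.copy()
combined[1] -= 0.5 * combined[0]  # 3. add -1/2 times row 1 to row 2

for M in (swapped, scaled, combined):
    print(np.allclose(solve_augmented(M), original))  # True each time
```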

The augmented matrix

Write the system $A\mathbf{x} = \mathbf{b}$ as an augmented matrix $[A \mid \mathbf{b}]$ by placing $\mathbf{b}$ next to $A$ :

\left[\begin{array}{cc|c} 2 & 3 & 8 \\ 1 & -1 & 1 \end{array}\right]

Then apply row operations until you reach row echelon form (upper triangular), and solve by back substitution.

Gaussian elimination steps:

graph TD
  S1["Start: Augmented matrix A|b"] --> S2["Apply row operations<br/>Swap, scale, add multiples"]
  S2 --> S3["Reach row echelon form<br/>Upper triangular"]
  S3 --> S4["Back substitution<br/>Solve from bottom row up"]
  S4 --> S5["Solution vector x"]
  style S1 fill:#e1f5fe
  style S5 fill:#c8e6c9
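The steps above can be sketched in a few lines of NumPy. This is a teaching sketch rather than a production solver: the function name `gaussian_elimination` is illustrative, and it adds partial pivoting (always swapping in the row with the largest available pivot), a common safeguard that the hand procedure does not require.

```python
import numpy as np

def gaussian_elimination(A, b):
    """Solve Ax = b by forward elimination with partial pivoting, then back substitution."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(b)
    aug = np.hstack([A, b.reshape(-1, 1)])  # augmented matrix [A | b]

    for col in range(n):
        # Swap in the row with the largest entry in this column
        pivot_row = col + np.argmax(np.abs(aug[col:, col]))
        if np.isclose(aug[pivot_row, col], 0.0):
            raise ValueError("matrix is singular or nearly singular")
        if pivot_row != col:
            aug[[col, pivot_row]] = aug[[pivot_row, col]]
        # Subtract multiples of the pivot row to zero out entries below the pivot
        for row in range(col + 1, n):
            factor = aug[row, col] / aug[col, col]
            aug[row, col:] -= factor * aug[col, col:]

    # Back substitution, from the bottom row up
    x = np.zeros(n)
    for row in range(n - 1, -1, -1):
        known = aug[row, row + 1:n] @ x[row + 1:]
        x[row] = (aug[row, -1] - known) / aug[row, row]
    return x

x = gaussian_elimination([[1, 2, 1], [2, 5, 2], [3, 7, 4]], [9, 21, 30])
print(np.allclose(x, [3, 3, 0]))  # True
```

Because of the row swaps, the intermediate matrices differ from a hand calculation that keeps the original row order, but the solution is the same.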

Figure: two lines intersecting at a unique solution, (2, 5).

Worked example 1: solving a 2x2 system

Solve the system:

2x + 3y = 8

x - y = 1

Method 1: using the inverse formula.

A = \begin{bmatrix} 2 & 3 \\ 1 & -1 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 8 \\ 1 \end{bmatrix}

Compute the determinant:

\det(A) = (2)(-1) - (3)(1) = -2 - 3 = -5

Since $\det(A) = -5 \neq 0$ , the inverse exists:

A^{-1} = \frac{1}{-5} \begin{bmatrix} -1 & -3 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} 1/5 & 3/5 \\ 1/5 & -2/5 \end{bmatrix}

Now multiply:

\mathbf{x} = A^{-1}\mathbf{b} = \begin{bmatrix} 1/5 & 3/5 \\ 1/5 & -2/5 \end{bmatrix} \begin{bmatrix} 8 \\ 1 \end{bmatrix}

x = (1/5)(8) + (3/5)(1) = 8/5 + 3/5 = 11/5 = 2.2

y = (1/5)(8) + (-2/5)(1) = 8/5 - 2/5 = 6/5 = 1.2

Check: plug back into the original equations:

2(2.2) + 3(1.2) = 4.4 + 3.6 = 8 \quad \checkmark

2.2 - 1.2 = 1 \quad \checkmark

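import numpy as np

# Coefficient matrix and right-hand side for 2x + 3y = 8, x - y = 1
A = np.array([[2, 3], [1, -1]])
b = np.array([8, 1])
x = np.linalg.solve(A, b)  # direct solve, no explicit inverse
print(x)  # [2.2 1.2]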

Worked example 2: solving a 3x3 system with Gaussian elimination

Solve:

x + 2y + z = 9

2x + 5y + 2z = 21

3x + 7y + 4z = 30

Step 1: write the augmented matrix.

\left[\begin{array}{ccc|c} 1 & 2 & 1 & 9 \\ 2 & 5 & 2 & 21 \\ 3 & 7 & 4 & 30 \end{array}\right]

Step 2: eliminate below the first pivot.

$R_2 \leftarrow R_2 - 2R_1$ :

\left[\begin{array}{ccc|c} 1 & 2 & 1 & 9 \\ 0 & 1 & 0 & 3 \\ 3 & 7 & 4 & 30 \end{array}\right]

$R_3 \leftarrow R_3 - 3R_1$ :

\left[\begin{array}{ccc|c} 1 & 2 & 1 & 9 \\ 0 & 1 & 0 & 3 \\ 0 & 1 & 1 & 3 \end{array}\right]

Step 3: eliminate below the second pivot.

$R_3 \leftarrow R_3 - R_2$ :

\left[\begin{array}{ccc|c} 1 & 2 & 1 & 9 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 0 \end{array}\right]

Step 4: back substitution.

From row 3: $z = 0$ .

From row 2: $y = 3$ .

From row 1: $x + 2(3) + 0 = 9$ , so $x = 9 - 6 = 3$ .

Solution: $x = 3$ , $y = 3$ , $z = 0$ .

Check: substitute into all three original equations:

3 + 2(3) + 0 = 3 + 6 + 0 = 9 \quad \checkmark

2(3) + 5(3) + 2(0) = 6 + 15 + 0 = 21 \quad \checkmark

3(3) + 7(3) + 4(0) = 9 + 21 + 0 = 30 \quad \checkmark

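import numpy as np

# Coefficient matrix and right-hand side of the 3x3 system above
A = np.array([[1, 2, 1],
              [2, 5, 2],
              [3, 7, 4]])
b = np.array([9, 21, 30])
x = np.linalg.solve(A, b)  # direct solve, no explicit inverse
print(x)  # approximately [3. 3. 0.]; the last entry may show tiny rounding error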

Worked example 3: a system with a unique twist

Solve:

3x + y = 7

x + 2y = 6

Then verify the result by computing $A^{-1}$ explicitly and checking that $A A^{-1} = I$.

Step 1: set up.

A = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 7 \\ 6 \end{bmatrix}

Step 2: compute the determinant.

\det(A) = (3)(2) - (1)(1) = 6 - 1 = 5

Step 3: compute the inverse.

A^{-1} = \frac{1}{5} \begin{bmatrix} 2 & -1 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix}

Step 4: solve.

\mathbf{x} = A^{-1}\mathbf{b} = \begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix} \begin{bmatrix} 7 \\ 6 \end{bmatrix}

x = (2/5)(7) + (-1/5)(6) = 14/5 - 6/5 = 8/5 = 1.6

y = (-1/5)(7) + (3/5)(6) = -7/5 + 18/5 = 11/5 = 2.2

Step 5: verify.

Check $A A^{-1} = I$ :

\begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 2/5 & -1/5 \\ -1/5 & 3/5 \end{bmatrix}

Entry $(1,1)$ : $(3)(2/5) + (1)(-1/5) = 6/5 - 1/5 = 5/5 = 1$ ✓

Entry $(1,2)$ : $(3)(-1/5) + (1)(3/5) = -3/5 + 3/5 = 0$ ✓

Entry $(2,1)$ : $(1)(2/5) + (2)(-1/5) = 2/5 - 2/5 = 0$ ✓

Entry $(2,2)$ : $(1)(-1/5) + (2)(3/5) = -1/5 + 6/5 = 5/5 = 1$ ✓

A A^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I \quad \checkmark

Check the solution in the original equations:

3(1.6) + 2.2 = 4.8 + 2.2 = 7 \quad \checkmark

1.6 + 2(2.2) = 1.6 + 4.4 = 6 \quad \checkmark
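The same checks can be reproduced in NumPy. The explicit inverse is computed here only to mirror the hand calculation; the comparisons use `np.allclose` because floating-point results rarely match fractions exactly.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([7.0, 6.0])

# Explicit inverse, to compare with the hand-computed entries 2/5, -1/5, -1/5, 3/5
A_inv = np.linalg.inv(A)
print(np.allclose(A_inv, [[0.4, -0.2], [-0.2, 0.6]]))  # True
print(np.allclose(A @ A_inv, np.eye(2)))                # True

# The solution itself, via a direct solve
x = np.linalg.solve(A, b)
print(np.allclose(x, [1.6, 2.2]))                       # True
```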

When things go wrong: singular matrices

Not every matrix has an inverse. Consider:

A = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix}

The determinant is $(2)(2) - (4)(1) = 4 - 4 = 0$ . This matrix is singular.

Look at the rows: row 1 is exactly 2 times row 2. The rows are linearly dependent. Geometrically, both equations describe the same line (or parallel lines, depending on $\mathbf{b}$ ), so there is either no solution or infinitely many.

In ML, singular matrices cause problems. If the matrix $X^T X$ built from your feature matrix $X$ is singular (or close to singular), the normal equations for linear regression have no unique solution. This happens, for example, when one feature is an exact multiple of another. This is why regularization (adding $\lambda I$ with $\lambda > 0$ to $X^T X$ ) is so important: it pushes the matrix away from singularity.
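A small illustration of that effect, using made-up data: the feature matrix `X`, the targets `y`, and the regularization strength `lam` below are all hypothetical values chosen so that the second feature is exactly twice the first.

```python
import numpy as np

# Hypothetical feature matrix: the second column is exactly twice the first
X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])

gram = X.T @ X
print(np.linalg.matrix_rank(gram))   # 1, so the 2x2 matrix is singular

lam = 0.1  # illustrative regularization strength
ridge = gram + lam * np.eye(2)
print(np.linalg.matrix_rank(ridge))  # 2, now invertible

# Regularized normal equations: (X^T X + lam * I) w = X^T y
w = np.linalg.solve(ridge, X.T @ y)
print(w)
```

Without the $\lambda I$ term, infinitely many weight vectors fit this data equally well; the regularized system picks one of small norm.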

Computational considerations

In practice, you almost never compute matrix inverses explicitly. Here is why:

Numerical stability. Computing $A^{-1}$ and then multiplying by $\mathbf{b}$ adds an extra step in which rounding errors can grow. Gaussian elimination with row pivoting, applied directly to $[A \mid \mathbf{b}]$ , is more stable.
Speed. Solving $A\mathbf{x} = \mathbf{b}$ directly takes roughly $\frac{2n^3}{3}$ operations. Computing $A^{-1}$ first takes roughly $2n^3$ operations, then multiplying takes $2n^2$ more. Direct solving is about three times faster.
Memory. The explicit inverse needs $n^2$ stored entries, and it is generally dense even when $A$ is sparse.

Use np.linalg.solve(A, b) instead of np.linalg.inv(A) @ b. The two agree up to rounding error, but solve is faster and usually more accurate.

import numpy as np

A = np.array([[1, 2, 1],
              [2, 5, 2],
              [3, 7, 4]])
b = np.array([9, 21, 30])

# Preferred: direct solve
x = np.linalg.solve(A, b)

# Avoid: explicit inverse
x_bad = np.linalg.inv(A) @ b

# Same result, but solve() is faster and more numerically stable
print(np.allclose(x, x_bad))  # True
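To put the operation counts quoted above in perspective, the arithmetic below plugs in $n = 1000$. These are the leading-order estimates from the list, not measured timings, and the two function names are illustrative.

```python
def solve_flops(n):
    """Approximate operation count for a direct solve: 2n^3 / 3."""
    return 2 * n**3 / 3

def inverse_then_multiply_flops(n):
    """Approximate count for forming the inverse (2n^3) and multiplying by b (2n^2)."""
    return 2 * n**3 + 2 * n**2

n = 1000
ratio = inverse_then_multiply_flops(n) / solve_flops(n)
print(f"{ratio:.3f}")  # 3.003
```

Actual speedups depend on the library and hardware, so treat the factor of three as a rough guide.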

The determinant

The determinant is a single number that encodes important information about a matrix.

For a $2 \times 2$ matrix:

\det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc

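For larger matrices, you can compute the determinant using cofactor expansion or by performing Gaussian elimination. If you only swap rows and add multiples of one row to another (no scaling), the determinant is the product of the pivots, with the sign flipped once for each row swap.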

What the determinant tells you:

$\det(A) \neq 0$ : $A$ is invertible, $A\mathbf{x} = \mathbf{b}$ has a unique solution.
$\det(A) = 0$ : $A$ is singular, so $A\mathbf{x} = \mathbf{b}$ has either no solution or infinitely many.
$|\det(A)|$ gives the factor by which $A$ scales areas (2D) or volumes (3D).

The determinant also connects directly to eigenvalues: the determinant of a matrix equals the product of its eigenvalues.
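Both facts can be checked on the matrix from worked example 3. In this sketch the area of the transformed unit square is computed with the shoelace formula, a choice made here so that the area check does not simply reuse the determinant formula.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

det = np.linalg.det(A)
eigenvalues = np.linalg.eigvals(A)
print(np.isclose(det, 5.0))                   # True
print(np.isclose(np.prod(eigenvalues), det))  # True

# Map the corners of the unit square through A; each row of image is A @ corner
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
image = square @ A.T
xs, ys = image[:, 0], image[:, 1]

# Shoelace formula for the area of the resulting parallelogram
area = 0.5 * abs(np.dot(xs, np.roll(ys, -1)) - np.dot(ys, np.roll(xs, -1)))
print(np.isclose(area, abs(det)))             # True
```

The unit square has area 1, so the image having area 5 matches $|\det(A)| = 5$.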

Summary

Concept	Key idea
$A\mathbf{x} = \mathbf{b}$	The standard form for a linear system
Inverse $A^{-1}$	Satisfies $AA^{-1} = I$ ; gives $\mathbf{x} = A^{-1}\mathbf{b}$
Determinant	Zero means singular (no inverse)
Gaussian elimination	Row operations to reach triangular form, then back-substitute
Singular matrix	Linearly dependent rows, $\det = 0$ , no unique solution

What comes next

Inverses tell you how to undo a matrix transformation. But what directions does a matrix leave unchanged? The next article covers eigenvalues and eigenvectors, one of the most powerful ideas in linear algebra and a foundation for PCA, spectral methods, and stability analysis.
