Chapter 12

Change of Basis

The same vector looks different depending on whose coordinate system you're using. How do you translate between viewpoints?

When you write a vector as $(3, 1)$, you're saying "3 units along the x-axis, 1 unit along the y-axis." But that's a choice. Someone else might use a different pair of reference directions -- rotated, scaled, skewed -- and they'd describe the exact same arrow in space with completely different numbers. Neither description is wrong. They're just using different coordinates.

This chapter is about the translation layer between those descriptions. The matrix that converts coordinates from one basis to another is called the change of basis matrix, and the formula $P^{-1}AP$ lets you express the same transformation from a different point of view. If you've ever converted between world space and camera space in a graphics engine, you've already done this.

Two coordinate systems

Let's set up two different ways to describe points in the plane. The standard basis uses the usual horizontal and vertical axes: $\hat{\imath} = (1, 0)$ and $\hat{\jmath} = (0, 1)$. That's the gray grid below.

Now consider an alternative basis with vectors $\vec{b}_1 = (1, 1)$ and $\vec{b}_2 = (-1, 1)$. These are rotated 45 degrees from the standard axes (and each is $\sqrt{2}$ units long rather than 1). The green grid below shows the coordinate lines of this alternative system.

Take the vector that we call $(3, 1)$ in the standard basis. In the alternative basis, that same arrow is described as $(2, -1)$ -- meaning 2 units along $\vec{b}_1$ and $-1$ unit along $\vec{b}_2$. Different numbers, same arrow.

Figure: The gray grid is the standard coordinate system. The green grid is the alternative basis -- rotated 45 degrees. The same vector is $(3, 1)$ in standard coordinates or $(2, -1)$ in the alternative basis. The dashed lines show the decomposition: 2 units of $\vec{b}_1$ plus $-1$ units of $\vec{b}_2$.

Verify by hand: $2\vec{b}_1 + (-1)\vec{b}_2 = 2(1, 1) + (-1)(-1, 1) = (2, 2) + (1, -1) = (3, 1)$. Same arrow, different description.
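The same check can be done numerically. This is a minimal sketch using NumPy; the variable names are illustrative only.

```python
import numpy as np

# Alternative basis vectors, written in standard coordinates
b1 = np.array([1, 1])
b2 = np.array([-1, 1])

# Coordinates (2, -1) in the alternative basis mean 2*b1 + (-1)*b2
v_standard = 2 * b1 + (-1) * b2
print(v_standard)  # the standard coordinates (3, 1)
```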

The change of basis matrix P

The matrix that translates between these two descriptions is built from the alternative basis vectors. Pack $\vec{b}_1$ and $\vec{b}_2$ as the columns of a matrix $P$:

$$P = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$$

This matrix converts coordinates from the alternative basis to the standard basis. If a vector has coordinates $(2, -1)$ in the alternative system, then:

$$P\begin{bmatrix} 2 \\ -1 \end{bmatrix} = 2\begin{bmatrix} 1 \\ 1 \end{bmatrix} + (-1)\begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$$

That's just the definition of what "coordinates in a basis" means -- a linear combination of the basis vectors. The matrix $P$ packages that linear combination into a single multiplication.

To go the other direction -- standard to alternative -- you use the inverse:

$$P^{-1} = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$$

$$P^{-1}\begin{bmatrix} 3 \\ 1 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 3 + 1 \\ -3 + 1 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \end{bmatrix}$$

Standard coordinates go in, alternative coordinates come out.
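Both directions of the conversion can be sketched in a few lines of NumPy. The names `v_alt`, `v_std`, and `v_alt_again` are illustrative. The sketch uses `np.linalg.solve` for the standard-to-alternative direction, which solves $P\vec{x} = \vec{v}$ without forming $P^{-1}$ explicitly; calling `np.linalg.inv(P)` and multiplying would give the same coordinates.

```python
import numpy as np

# Columns of P are the alternative basis vectors in standard coordinates
b1 = np.array([1.0, 1.0])
b2 = np.array([-1.0, 1.0])
P = np.column_stack([b1, b2])

# Alternative -> standard: multiply by P
v_alt = np.array([2.0, -1.0])
v_std = P @ v_alt

# Standard -> alternative: solve P x = v_std (equivalent to P^{-1} v_std)
v_alt_again = np.linalg.solve(P, v_std)

print(v_std)        # standard coordinates (3, 1)
print(v_alt_again)  # back to alternative coordinates (2, -1)
```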

Figure: The columns of $P$ are the alternative basis vectors expressed in the standard basis. Multiplying $P$ by a vector in alternative coordinates gives you that same vector in standard coordinates. The inverse $P^{-1}$ goes the other way.

Think of $P$ as a translator. It doesn't change the vector -- it changes the language used to describe it. The arrow in space stays put. Only the numbers change.

A transformation in another basis

Now here's where change of basis gets genuinely powerful. Suppose you have a transformation described by a matrix $A$ in the standard basis. What does that same transformation look like to someone using the alternative basis?

Let's use a horizontal stretch by a factor of 2:

$$A = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$$

In the standard basis, this is simple: double the $x$-coordinate, leave the $y$-coordinate alone. But in the alternative basis -- the one rotated 45 degrees -- this transformation looks different. It's no longer a simple axis-aligned stretch because the alternative axes don't line up with the stretch direction.

To find the matrix $A'$ that describes the same transformation in the alternative basis, you use the similarity formula:

$$A' = P^{-1}AP$$

Here's what each step does to a vector in alternative coordinates:

  1. $P$: Translate to standard coordinates (the only language $A$ speaks)
  2. $A$: Apply the transformation (in standard coordinates)
  3. $P^{-1}$: Translate the result back to alternative coordinates

Figure: The pipeline: start with alternative coordinates $(2, -1)$, convert to standard with $P$ to get $(3, 1)$, apply the transformation $A$ to get $(6, 1)$, convert back with $P^{-1}$ to get $(3.5, -2.5)$. The lower part of the figure shows the actual geometric effect: the vector $(3, 1)$ gets stretched horizontally to $(6, 1)$. The geometry is the same no matter which coordinate system describes it.
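The three steps of the pipeline can be traced one at a time. This is a small sketch with illustrative variable names, using the chapter's $P$ and $A$.

```python
import numpy as np

P = np.array([[1.0, -1.0],
              [1.0,  1.0]])   # columns: b1, b2
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])    # horizontal stretch, in standard coordinates

v_alt = np.array([2.0, -1.0])

step1 = P @ v_alt                   # alternative -> standard: (3, 1)
step2 = A @ step1                   # apply the stretch:       (6, 1)
step3 = np.linalg.solve(P, step2)   # standard -> alternative: (3.5, -2.5)

print(step1, step2, step3)
```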

Let's compute $A'$ explicitly:

$$A' = P^{-1}AP = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$$

First, $AP$:

$$AP = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & -2 \\ 1 & 1 \end{bmatrix}$$

Then, $P^{-1}(AP)$:

$$A' = \frac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 2 & -2 \\ 1 & 1 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 3 & -1 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} 1.5 & -0.5 \\ -0.5 & 1.5 \end{bmatrix}$$

In the standard basis, the transformation is the clean $\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$ -- a simple horizontal stretch. In the rotated basis, the same transformation becomes $\begin{bmatrix} 1.5 & -0.5 \\ -0.5 & 1.5 \end{bmatrix}$ -- no longer diagonal, because the alternative axes aren't aligned with the stretch direction. The geometry hasn't changed. The description has.
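The hand computation can be cross-checked with NumPy. This sketch computes the intermediate product $AP$ and then $P^{-1}(AP)$, mirroring the two steps above; the variable names are illustrative.

```python
import numpy as np

P = np.array([[1.0, -1.0],
              [1.0,  1.0]])
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])

AP = A @ P                         # first step: [[2, -2], [1, 1]]
A_prime = np.linalg.inv(P) @ AP    # second step: P^{-1} (A P)

print(AP)
print(A_prime)   # entries 1.5 on the diagonal, -0.5 off the diagonal
```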

Same transformation, different descriptions

This is the key point: $A$ and $A' = P^{-1}AP$ describe the same geometric transformation. They're the same function written in different coordinate languages. Let's see this directly.

Apply $A$ to the vector $(3, 1)$ in standard coordinates:

$$A\begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 1 \end{bmatrix}$$

Apply $A'$ to the same vector written as $(2, -1)$ in alternative coordinates:

$$A'\begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 1.5 & -0.5 \\ -0.5 & 1.5 \end{bmatrix}\begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 3.5 \\ -2.5 \end{bmatrix}$$

Now convert that result back to standard coordinates: $P(3.5, -2.5) = 3.5(1, 1) + (-2.5)(-1, 1) = (3.5, 3.5) + (2.5, -2.5) = (6, 1)$.

Same answer. Both paths land at the same geometric point.
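The agreement is not special to $(3, 1)$. As a rough sanity check, the sketch below compares the two paths on a handful of random vectors. The seed and the sampling range are arbitrary choices made for illustration.

```python
import numpy as np

P = np.array([[1.0, -1.0],
              [1.0,  1.0]])
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
P_inv = np.linalg.inv(P)
A_prime = P_inv @ A @ P

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
matches = []
for _ in range(5):
    v_std = rng.uniform(-10, 10, size=2)
    direct = A @ v_std                          # path 1: stay in standard coordinates
    via_alt = P @ (A_prime @ (P_inv @ v_std))   # path 2: detour through the alternative basis
    matches.append(np.allclose(direct, via_alt))

print(all(matches))  # True
```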

Figure: Left: the transformation $A$ stretches horizontally, taking $(3, 1)$ to $(6, 1)$. Right: the same arrow, described in the alternative basis, goes from $(2, -1)$ to $(3.5, -2.5)$ under $A'$. Both paths arrive at the same point in space. The matrices are different because the coordinate systems are different.

Two matrices related by $A' = P^{-1}AP$ are called similar matrices. They're not the same matrix, but they represent the same underlying transformation. Think of it like the number "five" written as 5, V, 101 in binary, or cinco -- different representations of the same thing.
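One way to see that similar matrices share something intrinsic is to compare quantities that don't depend on the basis. The chapter doesn't state this, but it is a standard fact that similar matrices have the same determinant and the same trace. The sketch below checks this for the chapter's $A$ and $A'$.

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
A_prime = np.array([[ 1.5, -0.5],
                    [-0.5,  1.5]])

# Determinant (area scaling factor) is the same in both descriptions
same_det = np.isclose(np.linalg.det(A), np.linalg.det(A_prime))
# Trace (sum of diagonal entries) is the same too
same_trace = np.isclose(np.trace(A), np.trace(A_prime))

print(same_det, same_trace)  # True True
```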

The formal bit

Here are the key formulas, stated precisely.

Change of basis matrix: If $\{\vec{b}_1, \vec{b}_2, \ldots, \vec{b}_n\}$ is an alternative basis, the change of basis matrix is:

$$P = \begin{bmatrix} | & | & & | \\ \vec{b}_1 & \vec{b}_2 & \cdots & \vec{b}_n \\ | & | & & | \end{bmatrix}$$

Its columns are the alternative basis vectors expressed in the standard basis.

Converting coordinates:

$$\vec{v}_{\text{standard}} = P\vec{v}_{\text{alt}} \qquad \vec{v}_{\text{alt}} = P^{-1}\vec{v}_{\text{standard}}$$

$P$ goes from alternative to standard. $P^{-1}$ goes from standard to alternative.

Similarity transform -- expressing a transformation in a different basis:

$$A' = P^{-1}AP$$

$A$ is the matrix of the transformation in the standard basis. $A'$ is the matrix of the same transformation in the alternative basis. They describe the same geometric operation.

Why it works: For any vector $\vec{v}_{\text{alt}}$ in alternative coordinates:

$$A'(\vec{v}_{\text{alt}}) = P^{-1}AP(\vec{v}_{\text{alt}})$$

Reading right to left: $P$ translates to standard coordinates, $A$ applies the transformation, $P^{-1}$ translates the result back to alternative coordinates. The sandwich structure ensures the transformation is done in the language $A$ understands (standard coordinates), with automatic translation on both ends.
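These formulas work in any dimension, so they fit in a small helper. The function name `in_basis` is hypothetical and not from the chapter. As a design choice, this sketch computes $P^{-1}(AP)$ with `np.linalg.solve` instead of forming the inverse; `solve` raises `numpy.linalg.LinAlgError` if $P$ is singular, which happens when the supplied columns don't form a basis.

```python
import numpy as np

def in_basis(A, P):
    """Return P^{-1} A P: the matrix of the transformation A, rewritten
    in the basis whose vectors are the columns of P.
    (Hypothetical helper name, for illustration only.)"""
    A = np.asarray(A, dtype=float)
    P = np.asarray(P, dtype=float)
    # solve(P, B) computes P^{-1} B without building P^{-1} explicitly
    return np.linalg.solve(P, A @ P)

# The chapter's 2D example
P = np.array([[1.0, -1.0],
              [1.0,  1.0]])
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
A_prime = in_basis(A, P)
print(A_prime)   # entries 1.5 on the diagonal, -0.5 off the diagonal
```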

Worked example: world space to camera space

Here's where this becomes immediately useful. In 3D graphics, every object lives in world space -- a single global coordinate system. But the screen shows what the camera sees, and the camera has its own coordinate system defined by three vectors: right, up, and look (the direction the camera is facing), each written in world coordinates.

These three vectors form a basis -- the camera's basis. The change of basis matrix $P$ has these vectors as its columns:

$$P = \begin{bmatrix} | & | & | \\ \text{right} & \text{up} & \text{look} \\ | & | & | \end{bmatrix}$$

To convert a point from world coordinates to camera coordinates, you multiply by $P^{-1}$:

$$\vec{v}_{\text{camera}} = P^{-1}\vec{v}_{\text{world}}$$

Here's what this looks like in code:

import numpy as np

# Camera is at the origin, looking along (0.707, 0, 0.707)
# tilted 45 degrees from the z-axis
look  = np.array([0.707, 0, 0.707])
right = np.array([0.707, 0, -0.707])
up    = np.array([0, 1, 0])

# Build the change of basis matrix
P = np.column_stack([right, up, look])

# A point in world space
point_world = np.array([10, 5, 10])

# Convert to camera space
P_inv = np.linalg.inv(P)
point_camera = P_inv @ point_world
print(point_camera)  # approximately [0.0, 5.0, 14.14], in the order (right, up, look)

The world point $(10, 5, 10)$ is at approximately $(0, 5, 14.14)$ in camera coordinates, listed in the order (right, up, look). The right component is 0, so the point is horizontally centered in the camera's view; it sits about 14.14 units in front of the camera along the look direction and 5 units up.

This is exactly the pattern: $P^{-1}$ converts from one coordinate system (world) to another (camera). In game engines, this matrix is computed every frame as the camera moves. Every vertex in the scene gets multiplied by it. The geometry of the world doesn't change -- only the description, as seen from the camera's perspective.
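When the camera's three axes are orthonormal (unit length and mutually perpendicular), $P^{-1}$ equals the transpose $P^T$, so no general matrix inversion is needed. This is a standard property of orthogonal matrices rather than something the chapter derives. The sketch below assumes exact unit vectors built from `np.sqrt(0.5)` instead of the rounded 0.707.

```python
import numpy as np

s = np.sqrt(0.5)                      # exact value behind the rounded 0.707
right = np.array([s, 0.0, -s])
up    = np.array([0.0, 1.0, 0.0])
look  = np.array([s, 0.0, s])
P = np.column_stack([right, up, look])

# Orthonormal columns: P^T P is the identity, so P^T acts as P^{-1}
is_orthonormal = np.allclose(P.T @ P, np.eye(3))

point_world = np.array([10.0, 5.0, 10.0])
point_camera = P.T @ point_world      # each entry is a dot product with one camera axis

print(is_orthonormal)                 # True
print(np.round(point_camera, 2))      # components in the order (right, up, look)
```

Each camera coordinate is then just the dot product of the world point with the corresponding axis: here about 0 along right, 5 along up, and 14.14 along look.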

If you want to apply a transformation (like lighting or physics) that's naturally described in camera space, you'd use the same sandwich pattern with the roles of $P$ and $P^{-1}$ swapped: $PAP^{-1}$. Reading right to left, $P^{-1}$ converts to camera space, $A$ applies the transformation there, and $P$ converts back to world space.
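Here is a small sketch of that round trip. It assumes $P$ holds the camera axes as columns, so $P$ maps camera coordinates to world coordinates and $P^{-1}$ maps world to camera; acting on a world-space vector, "to camera space, transform, back to world space" therefore composes as $PAP^{-1}$ when read right to left. The camera-space matrix `A_cam` is a made-up example that doubles the distance along the look axis.

```python
import numpy as np

s = np.sqrt(0.5)
right = np.array([s, 0.0, -s])
up    = np.array([0.0, 1.0, 0.0])
look  = np.array([s, 0.0, s])
P = np.column_stack([right, up, look])     # camera coords -> world coords

# Hypothetical transformation, simple in camera space:
# leave right and up alone, double the component along look
A_cam = np.diag([1.0, 1.0, 2.0])

# world -> camera (P^{-1}), transform (A_cam), camera -> world (P)
A_world = P @ A_cam @ np.linalg.inv(P)

point_world = np.array([10.0, 5.0, 10.0])
moved = A_world @ point_world
print(np.round(moved, 6))   # the point pushed to about (20, 5, 20)
```

The point started 14.14 units along the look direction, so doubling that distance moves it from $(10, 5, 10)$ to about $(20, 5, 20)$ in world coordinates while its height stays at 5.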

Key Takeaway: Change of basis is $P^{-1}AP$. It expresses the same transformation from a different point of view. The geometry doesn't change -- only the description. The matrix $P$ (whose columns are the new basis vectors) translates between coordinate systems, and the similarity transform $A' = P^{-1}AP$ rewrites a transformation in the new language.

What's next

Some transformations become beautifully simple in the right basis. A messy matrix full of off-diagonal entries might secretly be just a scaling operation -- if you look at it from the right coordinate system. The vectors that define that perfect basis, the one where the transformation becomes pure scaling along each axis, are called eigenvectors. Finding them is one of the most important ideas in all of linear algebra. That's Chapter 13: Eigenvectors & Eigenvalues.