Chapter 4

Matrices as Transformations

You've seen 2x2 grids of numbers everywhere. In code, in textbooks, in API docs. But a matrix isn't just numbers arranged in a box -- it's a snapshot of a transformation, frozen in notation.

The last chapter showed that a linear transformation is fully determined by what it does to the basis vectors. This chapter makes that idea concrete: the basis vectors' landing spots become the columns of the matrix. That's the entire connection between "transformation" and "matrix." Once you see it, matrix-vector multiplication stops being a memorized formula and starts being obvious.

The basis vectors that define everything

Before any transformation happens, we need to see what we're working with. The standard basis vectors are \hat{\imath} = (1, 0) and \hat{\jmath} = (0, 1). Every vector in the plane is some combination of these two.

The vector (3, 2)? That's 3\hat{\imath} + 2\hat{\jmath}. The vector (-1, 4)? That's -1\hat{\imath} + 4\hat{\jmath}. These two basis vectors are the ingredients, and the coordinates are the recipe.

[Figure: the grid with \hat{\imath} = (1, 0) and \hat{\jmath} = (0, 1), and the vector (3, 2) = 3\hat{\imath} + 2\hat{\jmath}.]

These two basis vectors define the coordinate system. Every other vector is a combination of them.
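The recipe idea can be sketched in a few lines of Python. This is a minimal illustration; the name `combine` is made up for this example, not from any library:

```python
# Standard basis vectors of the plane.
i_hat = (1, 0)
j_hat = (0, 1)

def combine(x, y):
    """Return x copies of i_hat plus y copies of j_hat, componentwise."""
    return (x * i_hat[0] + y * j_hat[0],
            x * i_hat[1] + y * j_hat[1])

print(combine(3, 2))   # (3, 2): three parts i_hat, two parts j_hat
```

The coordinates (3, 2) are the recipe; the basis vectors are the ingredients being combined.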

Where do the basis vectors land?

Now apply a linear transformation. We don't need to describe what happens to every single point. We just need to say where \hat{\imath} and \hat{\jmath} end up. Everything else follows.

Suppose \hat{\imath} lands at (2, -1) and \hat{\jmath} lands at (1, 1.5). The original basis vectors are shown dashed. The new positions are shown solid. Notice how the entire grid has been warped -- but it's still made of straight, evenly-spaced lines. That's the signature of a linear transformation.

[Figure: the old basis vectors shown dashed; the new positions \hat{\imath}' = (2, -1) and \hat{\jmath}' = (1, 1.5) shown solid.]

A transformation is completely described by where \hat{\imath} and \hat{\jmath} land.

Reading the matrix

Here's the key move. Take those two landing spots and pack them into columns:

\begin{bmatrix} 2 & 1 \\ -1 & 1.5 \end{bmatrix}

The first column (2, -1) is where \hat{\imath} lands. The second column (1, 1.5) is where \hat{\jmath} lands. That's it. That's what a matrix is.

[Figure: the landing spots (2, -1) and (1, 1.5) packed into the two columns of the matrix.]

The first column is the new \hat{\imath}. The second column is the new \hat{\jmath}.

Every time you see a 2x2 matrix, you can read it as a transformation. The left column tells you where the horizontal basis vector goes. The right column tells you where the vertical basis vector goes. The matrix is the transformation.

Applying the matrix to a vector

Now for the payoff. Take a vector \mathbf{v} = (1, 2). Before the transformation, that means \mathbf{v} = 1\hat{\imath} + 2\hat{\jmath} -- one part \hat{\imath}, two parts \hat{\jmath}.

After the transformation, the recipe stays the same but the ingredients change. We still want one part of the first basis vector and two parts of the second. But now those basis vectors have moved:

\mathbf{v}' = 1 \cdot \begin{bmatrix} 2 \\ -1 \end{bmatrix} + 2 \cdot \begin{bmatrix} 1 \\ 1.5 \end{bmatrix} = \begin{bmatrix} 2 + 2 \\ -1 + 3 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix}

[Figure: \mathbf{v} = (1, 2) rebuilt as 1 copy of (2, -1) plus 2 copies of (1, 1.5), landing at \mathbf{v}' = (4, 2).]

Multiplying means "take x of the first column, plus y of the second." That's it.

The blue arrow is 1 copy of the new \hat{\imath} (which is (2, -1)). The green arrow is 2 copies of the new \hat{\jmath} (which is (1, 1.5)). Add them up and you get the orange result: (4, 2).

Matrix-vector multiplication isn't some arbitrary rule. It's just saying: "use the same recipe with the new ingredients."
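"Same recipe, new ingredients" translates directly into code. This is a sketch: the helper name `apply` and the column-pair storage are choices made for this example, not a standard API:

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix (stored as a pair of columns) by a vector:
    x copies of the first column plus y copies of the second."""
    (a, c), (b, d) = matrix   # column 1 = new i_hat, column 2 = new j_hat
    x, y = vector
    return (x * a + y * b, x * c + y * d)

M = ((2, -1), (1, 1.5))       # i_hat -> (2, -1), j_hat -> (1, 1.5)
print(apply(M, (1, 2)))       # (4, 2.0), matching the worked example
```

Storing the matrix as its columns makes the geometric reading unavoidable: unpacking `matrix` hands you the two landed basis vectors directly.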

The formal bit

Here's the general formula. A matrix times a vector:

\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = x\begin{bmatrix} a \\ c \end{bmatrix} + y\begin{bmatrix} b \\ d \end{bmatrix} = \begin{bmatrix} ax + by \\ cx + dy \end{bmatrix}

You might have learned this as "row times column" -- multiply across, add up. That procedure gives the right answer, but it hides the meaning. The column view makes the meaning visible: x scales the first column, y scales the second column, and you add the results.

Think of it this way: the input coordinates (x, y) tell you the recipe -- how much of each basis vector you need. The matrix columns are the ingredients -- the new basis vectors. Same recipe, different ingredients.

This also makes it clear why matrix-vector multiplication is linear. Scaling the input just scales the recipe. Adding two inputs adds two recipes. The structure is preserved because you're always just combining columns.
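Both claims can be checked numerically. A minimal sketch, redefining a hypothetical `apply` helper so the snippet stands alone (the inputs are chosen so the floating-point arithmetic is exact):

```python
def apply(M, v):
    (a, c), (b, d) = M        # matrix stored as a pair of columns
    x, y = v
    return (a * x + b * y, c * x + d * y)

M = ((2, -1), (1, 1.5))
u, w = (1, 2), (3, -1)

# Adding inputs adds recipes: M(u + w) == M(u) + M(w).
summed = (u[0] + w[0], u[1] + w[1])
assert apply(M, summed) == tuple(p + q for p, q in zip(apply(M, u), apply(M, w)))

# Scaling the input scales the recipe: M(3u) == 3 * M(u).
scaled = (3 * u[0], 3 * u[1])
assert apply(M, scaled) == tuple(3 * p for p in apply(M, u))
```

Neither assertion depends on the particular matrix: the output is always a combination of the columns, so the structure of the input carries through.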

Worked example: UI scaling

Suppose you're building a drawing tool and the user drags a resize handle that doubles the width and halves the height. What matrix describes this transformation?

The horizontal basis vector (1, 0) should map to (2, 0) -- stretched horizontally. The vertical basis vector (0, 1) should map to (0, 0.5) -- compressed vertically. Pack those into columns:

S = \begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix}

Apply this to the four corners of a unit square:

\begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad \begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}

\begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 0.5 \end{bmatrix} \qquad \begin{bmatrix} 2 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0.5 \end{bmatrix}

The unit square (0,0) \to (1,0) \to (1,1) \to (0,1) becomes the rectangle (0,0) \to (2,0) \to (2,0.5) \to (0,0.5).

[Figure: the original unit square and the transformed rectangle, with \hat{\imath}' = (2, 0) and \hat{\jmath}' = (0, 0.5).]

The unit square stretched to double width, half height. A diagonal matrix scales each axis independently.

Notice something about that matrix: it's diagonal. When a matrix is diagonal, the axes don't mix. x only affects x, and y only affects y. That's because the new \hat{\imath} still points along the x-axis (just longer), and the new \hat{\jmath} still points along the y-axis (just shorter). Diagonal matrices are pure scaling -- no rotation, no shearing.

In code, this is the matrix you'd pass to a transform function when a user resizes an element. And now you know exactly what those four numbers mean.
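The resize example can be sketched in plain Python. The `apply` helper here is just this chapter's column-combination rule written out, not a library call:

```python
def apply(M, v):
    (a, c), (b, d) = M        # columns of the matrix
    x, y = v
    return (a * x + b * y, c * x + d * y)

# Double the width, halve the height: the columns are the new basis vectors.
S = ((2, 0), (0, 0.5))

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print([apply(S, corner) for corner in square])
# [(0, 0.0), (2, 0.0), (2, 0.5), (0, 0.5)]
```

Because S is diagonal, each output coordinate depends only on the matching input coordinate -- the code computes `a * x` and `d * y` with the cross terms zeroed out.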

Key Takeaway: A matrix is a transformation written down. Its columns tell you where the basis vectors land. Matrix-vector multiplication is just a linear combination of those columns -- same recipe, different ingredients.

What's next

We can describe a single transformation as a matrix. What happens when we apply one transformation after another? You might guess we multiply the matrices -- and you'd be right. But why does matrix multiplication work the way it does, and why does the order matter? That's Chapter 5: Matrix Multiplication -- composition of transformations.