Chapter 16

2D Transformations

Rotation, scaling, shearing, reflection -- you've met them individually. Now let's see them as matrices, combine them, and build real 2D graphics pipelines.

Every 2D linear transformation can be written as a $2 \times 2$ matrix. We've seen this abstractly: the columns of the matrix are where the basis vectors land. But when you're building a game engine, a UI framework, or a data visualization, you need the specific matrices for specific operations. What's the matrix for a 45-degree rotation? For flipping an image? For stretching a sprite wider without making it taller?

This chapter gives you the toolbox. Each transformation gets its own matrix, its own visual, and its own formula. Then we combine them -- because real graphics pipelines chain multiple transformations together, and the combined effect is a single matrix you can apply to every vertex at once.

We'll use a letter "F" as our test shape throughout. The F is perfect for this because it's asymmetric -- you can immediately tell when it's been rotated, flipped, or sheared. A circle or a square would hide half the transformation from you.

Rotation

The rotation matrix for angle $\theta$ (counterclockwise) is:

R(\theta) = \begin{bmatrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{bmatrix}

Where does this come from? Rotating $\hat{\imath} = (1, 0)$ by $\theta$ gives $(\cos\theta, \sin\theta)$ . Rotating $\hat{\jmath} = (0, 1)$ by $\theta$ gives $(-\sin\theta, \cos\theta)$ . Those become the columns of the matrix. That's it -- the matrix is just "where do the basis vectors land?"

For $\theta = 45°$ , we have $\cos 45° = \sin 45° \approx 0.707$ :

R(45°) = \begin{bmatrix}0.707 & -0.707 \\ 0.707 & 0.707\end{bmatrix}

Watch the F rotate 45 degrees counterclockwise. The original is dashed, the rotated version is solid.

Every point swings along a circular arc around the origin. The basis vectors $\hat{\imath}$ and $\hat{\jmath}$ both rotate by the same angle, staying perpendicular and unit-length.

Notice what rotation preserves: lengths, angles, and the shape of the F. The determinant of any rotation matrix is $\cos^2\theta + \sin^2\theta = 1$ -- no area change. Rotation is the most "well-behaved" transformation.
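As a quick numerical check of the "where do the basis vectors land?" picture, here is a minimal JavaScript sketch. The helper names `rotation`, `applyMatrix`, and `det`, and the choice to store a matrix as an array of rows, are assumptions made for this example, not part of any particular library.

```javascript
// Counterclockwise rotation matrix for an angle given in radians.
// Assumed layout: [[a, b], [c, d]], an array of rows.
function rotation(theta) {
  const c = Math.cos(theta);
  const s = Math.sin(theta);
  return [[c, -s], [s, c]];
}

// Multiply a 2x2 matrix by a vector [x, y].
function applyMatrix(M, v) {
  return [M[0][0] * v[0] + M[0][1] * v[1],
          M[1][0] * v[0] + M[1][1] * v[1]];
}

// Determinant of a 2x2 matrix: ad - bc.
function det(M) {
  return M[0][0] * M[1][1] - M[0][1] * M[1][0];
}

const R45 = rotation(Math.PI / 4);
const iHat = applyMatrix(R45, [1, 0]); // lands on the first column of R45
const jHat = applyMatrix(R45, [0, 1]); // lands on the second column of R45
console.log(iHat, jHat, det(R45));
```

The two printed vectors should come out close to $(0.707, 0.707)$ and $(-0.707, 0.707)$, and the determinant close to 1, up to floating-point rounding.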

Scaling

Non-uniform scaling stretches space by different amounts along each axis. The matrix is:

S(s_x, s_y) = \begin{bmatrix}s_x & 0 \\ 0 & s_y\end{bmatrix}

Each basis vector just gets multiplied by its corresponding scale factor. $\hat{\imath}$ becomes $(s_x, 0)$ and $\hat{\jmath}$ becomes $(0, s_y)$ . The zeros mean the axes don't tilt -- they stay aligned with the coordinate axes.

Let's scale by $s_x = 2$ horizontally and $s_y = 0.6$ vertically. The F gets stretched wide and squished short.

The F is stretched to double its width but squished to 60% of its height. Horizontal proportions are preserved -- the crossbars still look right -- but the overall aspect ratio is distorted. The determinant is $2 \times 0.6 = 1.2$ , so area increases by 20%.

The diagonal structure of the scaling matrix is what keeps the axes aligned. If you scale uniformly ( $s_x = s_y = s$ ), the matrix becomes $sI$ and every direction gets scaled equally -- a pure zoom.
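The same scaling can be sketched in a few lines of JavaScript. The helper names and the array-of-rows matrix layout are assumptions for illustration only; the unit square stands in for the F so the numbers are easy to read.

```javascript
// Scaling matrix, stored as an array of rows (assumed layout).
function scaling(sx, sy) {
  return [[sx, 0], [0, sy]];
}

// Multiply a 2x2 matrix by a vector [x, y].
function applyMatrix(M, v) {
  return [M[0][0] * v[0] + M[0][1] * v[1],
          M[1][0] * v[0] + M[1][1] * v[1]];
}

// Determinant of a 2x2 matrix: ad - bc.
function det(M) {
  return M[0][0] * M[1][1] - M[0][1] * M[1][0];
}

const S = scaling(2, 0.6);
// Corners of the unit square, listed counterclockwise.
const square = [[0, 0], [1, 0], [1, 1], [0, 1]];
const stretched = square.map(v => applyMatrix(S, v));
console.log(stretched); // corners of a rectangle 2 wide and 0.6 tall
console.log(det(S));    // area scale factor, about 1.2
```

The unit square has area 1, and the stretched rectangle has area $2 \times 0.6$, which is exactly what the determinant reports.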

Shear

A horizontal shear tilts the vertical axis while keeping the horizontal axis fixed. The matrix is:

H(k) = \begin{bmatrix}1 & k \\ 0 & 1\end{bmatrix}

The basis vector $\hat{\imath}$ doesn't move (column 1 is $(1, 0)$ ). But $\hat{\jmath}$ slides sideways to $(k, 1)$ -- still unit height, but shifted horizontally by $k$ . Points higher up get pushed further to the right.

With $k = 0.8$ , every point at height $y$ slides right by $0.8y$ :

The bottom of the F stays put (points at $y = 0$ don't move). Points higher up slide further right. The F looks italicized -- that's exactly what italic text is, geometrically: a shear. The dashed lines show how much each height level has shifted.

Shearing is subtle because it preserves area -- the determinant of a shear matrix is always 1. The shape is distorted, but the total area it covers doesn't change. This makes shears invisible to determinant-based measurements. You need to look at the shape itself to see that angles have changed.
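Here is a small JavaScript sketch of that point: the determinant stays at 1, while the dot product of the transformed basis vectors shows they are no longer perpendicular. The helper names and the array-of-rows layout are assumptions made for this example.

```javascript
// Horizontal shear matrix, stored as an array of rows (assumed layout).
function shear(k) {
  return [[1, k], [0, 1]];
}

// Multiply a 2x2 matrix by a vector [x, y].
function applyMatrix(M, v) {
  return [M[0][0] * v[0] + M[0][1] * v[1],
          M[1][0] * v[0] + M[1][1] * v[1]];
}

// Determinant of a 2x2 matrix: ad - bc.
function det(M) {
  return M[0][0] * M[1][1] - M[0][1] * M[1][0];
}

const H = shear(0.8);

// Three points stacked on the y-axis at heights 0, 1, and 2.
const stack = [[0, 0], [0, 1], [0, 2]];
const sheared = stack.map(v => applyMatrix(H, v));
console.log(sheared); // x shifts by 0.8 * y, y is unchanged

// The determinant cannot see the distortion...
console.log(det(H));

// ...but the dot product of the transformed basis vectors can:
// a nonzero value means they are no longer perpendicular.
const iHat = applyMatrix(H, [1, 0]);
const jHat = applyMatrix(H, [0, 1]);
const dot = iHat[0] * jHat[0] + iHat[1] * jHat[1];
console.log(dot);
```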

Reflection

Reflection across the $y$ -axis negates the $x$ -coordinate of every point. The matrix is:

M_y = \begin{bmatrix}-1 & 0 \\ 0 & 1\end{bmatrix}

This sends $\hat{\imath}$ to $(-1, 0)$ -- it flips the horizontal axis -- while leaving $\hat{\jmath}$ unchanged. The result is a mirror image: left becomes right, right becomes left.

The F is mirrored across the $y$ -axis. The crossbars now point left instead of right. The dashed lines connect corresponding points, showing equal distances from the mirror line. This is the visual signature of a negative determinant: orientation is reversed.

The determinant of the reflection matrix is $-1$ . That negative sign is the mathematical fingerprint of a reflection: it tells you the transformation reverses orientation. A clockwise-wound polygon becomes counterclockwise. In graphics programming, this flips the winding order of triangles, which is how renderers decide which face of a surface you're looking at.

You can reflect across any line through the origin. Reflection across the $x$ -axis uses $\begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix}$ . Reflection across the line $y = x$ uses $\begin{bmatrix}0 & 1 \\ 1 & 0\end{bmatrix}$ . Each one has determinant $-1$ .
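To make the winding-order flip concrete, the sketch below measures the signed area of a triangle before and after reflection. It uses the shoelace formula, which is positive for counterclockwise vertex order and negative for clockwise. The function names and the sample triangle are assumptions chosen for illustration.

```javascript
// Multiply a 2x2 matrix (array of rows) by a vector [x, y].
function applyMatrix(M, v) {
  return [M[0][0] * v[0] + M[0][1] * v[1],
          M[1][0] * v[0] + M[1][1] * v[1]];
}

// Signed area via the shoelace formula: positive when the vertices
// are listed counterclockwise, negative when clockwise.
function signedArea(polygon) {
  let sum = 0;
  for (let i = 0; i < polygon.length; i++) {
    const [x1, y1] = polygon[i];
    const [x2, y2] = polygon[(i + 1) % polygon.length];
    sum += x1 * y2 - x2 * y1;
  }
  return sum / 2;
}

const reflectY = [[-1, 0], [0, 1]];          // reflection across the y-axis
const triangle = [[0, 0], [2, 0], [0, 1]];   // listed counterclockwise
const mirrored = triangle.map(v => applyMatrix(reflectY, v));

console.log(signedArea(triangle)); // 1
console.log(signedArea(mirrored)); // -1
```

The size of the area is unchanged, but its sign flips, which matches a determinant of $-1$.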

Composition: scale then rotate

The real power comes from combining transformations. In practice, you rarely apply just one. A game sprite needs to be scaled to the right size, rotated to face the right direction, and (eventually, with homogeneous coordinates) translated to the right position.

Let's scale the F by $(1.5, 1.5)$ -- uniform scaling to 150% -- and then rotate it by 30 degrees. We do the scaling first, then the rotation. In matrix notation, that means $R(30°) \cdot S(1.5, 1.5)$ -- the rightmost matrix is applied first.

S = \begin{bmatrix}1.5 & 0 \\ 0 & 1.5\end{bmatrix}, \quad R = \begin{bmatrix}0.866 & -0.5 \\ 0.5 & 0.866\end{bmatrix}

The combined matrix:

R \cdot S = \begin{bmatrix}0.866 & -0.5 \\ 0.5 & 0.866\end{bmatrix}\begin{bmatrix}1.5 & 0 \\ 0 & 1.5\end{bmatrix} = \begin{bmatrix}1.30 & -0.75 \\ 0.75 & 1.30\end{bmatrix}

Three stages: the faint dashed F is the original. The orange dashed outline shows the intermediate result after scaling by 1.5. The solid blue F is the final result after both scaling and rotating 30 degrees. The combined matrix in the box represents both operations as a single transformation.

The key insight: you can pre-compute that combined matrix once and then apply it to every vertex of your shape. Whether your sprite has 4 vertices or 4000, you multiply each one by the same $2 \times 2$ matrix. This is why game engines and graphics APIs work with transformation matrices -- they collapse an entire pipeline of operations into a single matrix multiply per vertex.

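Remember: matrices compose right to left. The matrix on the right acts first. $R \cdot S$ means "scale first ( $S$ ), then rotate ( $R$ )." If you did $S \cdot R$ instead, you'd rotate first and then scale, which in general gives a different result (as we saw in the matrix multiplication chapter). The example above is a special case: uniform scaling is $1.5I$ , which commutes with every matrix, so here the order happens not to matter. With a non-uniform scale such as $S(2, 0.6)$ , the two orders produce visibly different shapes.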
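To see the order dependence numerically, the following sketch multiplies a 30-degree rotation with a scaling in both orders. The helpers `rotation`, `scaling`, `multiply`, and `sameMatrix` are hypothetical names written for this example, with matrices stored as arrays of rows.

```javascript
function rotation(theta) {
  const c = Math.cos(theta);
  const s = Math.sin(theta);
  return [[c, -s], [s, c]];
}

function scaling(sx, sy) {
  return [[sx, 0], [0, sy]];
}

// Product A * B of two 2x2 matrices: B acts first, then A.
function multiply(A, B) {
  return [
    [A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
    [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]],
  ];
}

// True when every entry of A and B agrees within a small tolerance.
function sameMatrix(A, B, eps = 1e-12) {
  return A.every((row, i) => row.every((x, j) => Math.abs(x - B[i][j]) < eps));
}

const R = rotation(Math.PI / 6);   // 30 degrees
const uniform = scaling(1.5, 1.5); // same factor on both axes
const stretch = scaling(2, 0.6);   // different factor per axis

console.log(multiply(R, uniform)); // scale first, then rotate
console.log(sameMatrix(multiply(R, uniform), multiply(uniform, R))); // true
console.log(sameMatrix(multiply(R, stretch), multiply(stretch, R))); // false
```

The first printed matrix should be close to the $R \cdot S$ computed above. The two comparisons show that a uniform scale can be swapped with a rotation, while a non-uniform one cannot.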

The formal bit

Every 2D linear transformation corresponds to a $2 \times 2$ matrix. Here are the essential ones:

Rotation by angle $\theta$ counterclockwise:

R(\theta) = \begin{bmatrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{bmatrix}, \quad \det = 1

Scaling by factors $s_x$ and $s_y$ :

S(s_x, s_y) = \begin{bmatrix}s_x & 0 \\ 0 & s_y\end{bmatrix}, \quad \det = s_x \cdot s_y

Horizontal shear by factor $k$ :

H(k) = \begin{bmatrix}1 & k \\ 0 & 1\end{bmatrix}, \quad \det = 1

Reflection across the $y$ -axis:

M_y = \begin{bmatrix}-1 & 0 \\ 0 & 1\end{bmatrix}, \quad \det = -1

Some important patterns:

Rotations and shears preserve area ( $\det = 1$ ), but shears change angles while rotations don't.
Scaling changes area by a factor of $|s_x \cdot s_y|$ .
Reflections have $\det = -1$ , signaling orientation reversal.
Composing any of these (by matrix multiplication) gives another $2 \times 2$ matrix. The resulting transformation is always linear -- grid lines stay straight and evenly spaced, and the origin stays fixed.
Composition is not commutative. $R \cdot S \neq S \cdot R$ in general. The order you apply transformations matters.

One more pattern worth noting: rotation matrices are orthogonal (their columns are perpendicular unit vectors, and $R^T = R^{-1}$ ). Undoing a rotation is the same as transposing its matrix -- no expensive inverse computation needed.
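A short sketch of the transpose trick, assuming the same array-of-rows layout and hypothetical helper names as before: rotate a point by 60 degrees, then undo the rotation with the transposed matrix.

```javascript
function rotation(theta) {
  const c = Math.cos(theta);
  const s = Math.sin(theta);
  return [[c, -s], [s, c]];
}

// Swap rows and columns of a 2x2 matrix.
function transpose(M) {
  return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]];
}

// Multiply a 2x2 matrix by a vector [x, y].
function applyMatrix(M, v) {
  return [M[0][0] * v[0] + M[0][1] * v[1],
          M[1][0] * v[0] + M[1][1] * v[1]];
}

const R = rotation(Math.PI / 3); // 60 degrees
const Rt = transpose(R);         // also the inverse, since R is orthogonal

const v = [1, 0.5];
const rotated = applyMatrix(R, v);
const restored = applyMatrix(Rt, rotated);
console.log(rotated);
console.log(restored); // close to [1, 0.5], up to rounding
```

Note that this shortcut only applies to orthogonal matrices. A scaling or shear matrix still needs a real inverse.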

Worked example: building a sprite transform pipeline

You're building a 2D game. An enemy spaceship sprite needs to be positioned on screen. The sprite's vertices are defined in "model space" -- a local coordinate system where the ship is centered at the origin, pointing right, with a wingspan of 2 units.

The game logic says: this frame, the ship should be at position $(5, 3)$ , rotated 60 degrees, and scaled to 80% size. You need to transform every vertex from model space to screen space.

Step 1: Scale to 80%.

S = \begin{bmatrix}0.8 & 0 \\ 0 & 0.8\end{bmatrix}

This shrinks the ship uniformly. A wingtip at $(1, 0.5)$ moves to $(0.8, 0.4)$ .

Step 2: Rotate 60 degrees ( $\cos 60° = 0.5$ , $\sin 60° \approx 0.866$ ).

R = \begin{bmatrix}0.5 & -0.866 \\ 0.866 & 0.5\end{bmatrix}

Step 3: Compose. The combined scale-then-rotate matrix is $R \cdot S$ :

M = R \cdot S = \begin{bmatrix}0.5 & -0.866 \\ 0.866 & 0.5\end{bmatrix}\begin{bmatrix}0.8 & 0 \\ 0 & 0.8\end{bmatrix} = \begin{bmatrix}0.4 & -0.693 \\ 0.693 & 0.4\end{bmatrix}

Step 4: Apply to every vertex. For the wingtip at $(1, 0.5)$ :

M\begin{bmatrix}1 \\ 0.5\end{bmatrix} = \begin{bmatrix}0.4 \cdot 1 + (-0.693) \cdot 0.5 \\ 0.693 \cdot 1 + 0.4 \cdot 0.5\end{bmatrix} = \begin{bmatrix}0.054 \\ 0.893\end{bmatrix}

The wingtip moved from $(1, 0.5)$ to approximately $(0.05, 0.89)$ -- scaled down and rotated.

Step 5: Translation (adding position $(5, 3)$ ) can't be done with a $2 \times 2$ matrix -- that's a limitation we'll solve in the next chapter. For now, you'd add the position vector after the matrix multiply:

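// Helpers. Assumed layout (not specified in the text):
// a matrix is [[a, b], [c, d]] and a vector is [x, y].
const multiply2x2byVec = (M, v) =>
  [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]];
const multiply2x2 = (A, B) => {
  const c0 = multiply2x2byVec(A, [B[0][0], B[1][0]]); // first column of A * B
  const c1 = multiply2x2byVec(A, [B[0][1], B[1][1]]); // second column of A * B
  return [[c0[0], c1[0]], [c0[1], c1[1]]];
};

function transformSprite(vertices, scaleMatrix, rotMatrix, position) {
  // Compose once: scale first, then rotate (R * S)
  const M = multiply2x2(rotMatrix, scaleMatrix);

  // Apply to all vertices, then add the position offset
  return vertices.map(v => {
    const transformed = multiply2x2byVec(M, v);
    return [transformed[0] + position[0],
            transformed[1] + position[1]];
  });
}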

The matrix multiplication happens once. The per-vertex cost is a single matrix-vector multiply plus a vector addition. With 100 vertices, you do one matrix-matrix multiply and 100 matrix-vector multiplies, instead of 200 matrix-vector multiplies (one scale and one rotate per vertex).
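As a sanity check on the worked example, this self-contained sketch rebuilds $M = R \cdot S$ from the angle and scale factors and pushes the wingtip through it. The function names `scaleThenRotate` and `toScreen` are hypothetical, chosen only for this illustration.

```javascript
// Build R(theta) * S(sx, sy) directly. Multiplying by a diagonal matrix
// on the right scales the first column of R by sx and the second by sy.
function scaleThenRotate(theta, sx, sy) {
  const c = Math.cos(theta);
  const s = Math.sin(theta);
  return [[c * sx, -s * sy], [s * sx, c * sy]];
}

// Apply the 2x2 matrix to a vertex, then add the position offset.
function toScreen(M, position, v) {
  return [M[0][0] * v[0] + M[0][1] * v[1] + position[0],
          M[1][0] * v[0] + M[1][1] * v[1] + position[1]];
}

const M = scaleThenRotate(Math.PI / 3, 0.8, 0.8); // 60 degrees, 80% size
const wingtip = toScreen(M, [5, 3], [1, 0.5]);
console.log(wingtip.map(x => x.toFixed(3)));
```

The result should be roughly $(5.054, 3.893)$, which is the step 4 output $(0.054, 0.893)$ shifted by the position $(5, 3)$.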

Key Takeaway: Every 2D linear transformation is a $2 \times 2$ matrix. Rotation, scaling, shearing, and reflection each have a clean matrix form. Combining transforms means multiplying matrices -- right to left. Pre-compute the combined matrix once, then apply it to every vertex. This is the foundation of every 2D graphics pipeline.

What's next

There's one essential operation missing from our toolbox: translation -- moving an object to a different position. Look at step 5 of our worked example: we had to add the position separately because a $2 \times 2$ matrix can't move the origin. Every linear transformation fixes the origin in place.

Homogeneous coordinates solve this elegantly. By embedding 2D points into 3D space (adding a third coordinate that's always 1), translation becomes a matrix multiplication too. Then all transforms -- scale, rotate, shear, reflect, and translate -- can be composed into a single $3 \times 3$ matrix. One multiply per vertex handles everything.
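As a small preview, here is a sketch of the idea in JavaScript. It assumes the common convention of placing the offsets in the third column of the $3 \times 3$ matrix. The names `translation` and `apply3` are hypothetical.

```javascript
// 3x3 matrix that translates by (tx, ty) in homogeneous coordinates.
// Assumed convention: offsets sit in the third column.
function translation(tx, ty) {
  return [[1, 0, tx],
          [0, 1, ty],
          [0, 0, 1]];
}

// Multiply a 3x3 matrix by a homogeneous point [x, y, 1].
function apply3(M, p) {
  return M.map(row => row[0] * p[0] + row[1] * p[1] + row[2] * p[2]);
}

const T = translation(5, 3);
const moved = apply3(T, [1, 0.5, 1]);
console.log(moved); // x = 6, y = 3.5, third coordinate stays 1
```

Because the third coordinate of the point is 1, the third column of the matrix gets added to the result, and that addition is the translation.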