Chapter 16

2D Transformations

Rotation, scaling, shearing, reflection -- you've met them individually. Now let's see them as matrices, combine them, and build real 2D graphics pipelines.

Every 2D linear transformation can be written as a 2x2 matrix. We've seen this abstractly: the columns of the matrix are where the basis vectors land. But when you're building a game engine, a UI framework, or a data visualization, you need the specific matrices for specific operations. What's the matrix for a 45-degree rotation? For flipping an image? For stretching a sprite wider without making it taller?

This chapter gives you the toolbox. Each transformation gets its own matrix, its own visual, and its own formula. Then we combine them -- because real graphics pipelines chain multiple transformations together, and the combined effect is a single matrix you can apply to every vertex at once.

We'll use a letter "F" as our test shape throughout. The F is perfect for this because it's asymmetric -- you can immediately tell when it's been rotated, flipped, or sheared. A circle or a square would hide half the transformation from you.

Rotation

The rotation matrix for angle θ\theta (counterclockwise) is:

R(θ)=[cosθsinθsinθcosθ]R(\theta) = \begin{bmatrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{bmatrix}

Where does this come from? Rotating ı^=(1,0)\hat{\imath} = (1, 0) by θ\theta gives (cosθ,sinθ)(\cos\theta, \sin\theta). Rotating ȷ^=(0,1)\hat{\jmath} = (0, 1) by θ\theta gives (sinθ,cosθ)(-\sin\theta, \cos\theta). Those become the columns of the matrix. That's it -- the matrix is just "where do the basis vectors land?"

For θ=45°\theta = 45°, we have cos45°=sin45°0.707\cos 45° = \sin 45° \approx 0.707:

R(45°)=[0.7070.7070.7070.707]R(45°) = \begin{bmatrix}0.707 & -0.707 \\ 0.707 & 0.707\end{bmatrix}

Watch the F rotate 45 degrees counterclockwise. The original is dashed, the rotated version is solid.

original F rotated 45° i' j' 45° R(45°) [0.71, -0.71] [0.71, 0.71]

The dashed F is the original. The solid F has been rotated 45 degrees counterclockwise. Every point swings along a circular arc around the origin. The basis vectors ı^\hat{\imath} and ȷ^\hat{\jmath} both rotate by the same angle, staying perpendicular and unit-length.

Notice what rotation preserves: lengths, angles, and the shape of the F. The determinant of any rotation matrix is cos2θ+sin2θ=1\cos^2\theta + \sin^2\theta = 1 -- no area change. Rotation is the most "well-behaved" transformation.

Scaling

Non-uniform scaling stretches space by different amounts along each axis. The matrix is:

S(sx,sy)=[sx00sy]S(s_x, s_y) = \begin{bmatrix}s_x & 0 \\ 0 & s_y\end{bmatrix}

Each basis vector just gets multiplied by its corresponding scale factor. ı^\hat{\imath} becomes (sx,0)(s_x, 0) and ȷ^\hat{\jmath} becomes (0,sy)(0, s_y). The zeros mean the axes don't tilt -- they stay aligned with the coordinate axes.

Let's scale by sx=2s_x = 2 horizontally and sy=0.6s_y = 0.6 vertically. The F gets stretched wide and squished short.

original scaled: 2x wide, 0.6x tall i' = (2, 0) j' = (0, 0.6) S(2, 0.6) [2, 0] [0, 0.6]

The F is stretched to double its width but squished to 60% of its height. Horizontal proportions are preserved -- the crossbars still look right -- but the overall aspect ratio is distorted. The determinant is 2×0.6=1.22 \times 0.6 = 1.2, so area increases by 20%.

The diagonal structure of the scaling matrix is what keeps the axes aligned. If you scale uniformly (sx=sy=ss_x = s_y = s), the matrix becomes sIsI and every direction gets scaled equally -- a pure zoom.

Shear

A horizontal shear tilts the vertical axis while keeping the horizontal axis fixed. The matrix is:

H(k)=[1k01]H(k) = \begin{bmatrix}1 & k \\ 0 & 1\end{bmatrix}

The basis vector ı^\hat{\imath} doesn't move (column 1 is (1,0)(1, 0)). But ȷ^\hat{\jmath} slides sideways to (k,1)(k, 1) -- still unit height, but shifted horizontally by kk. Points higher up get pushed further to the right.

With k=0.8k = 0.8, every point at height yy slides right by 0.8y0.8y:

original sheared F i' = (1, 0) j' = (0.8, 1) +0.8 +1.6 +2.4 H(0.8) [1, 0.8] [0, 1 ]

The bottom of the F stays put (points at y=0y = 0 don't move). Points higher up slide further right. The F looks italicized -- that's exactly what italic text is, geometrically: a shear. The dashed lines show how much each height level has shifted.

Shearing is subtle because it preserves area -- the determinant of a shear matrix is always 1. The shape is distorted, but the total area it covers doesn't change. This makes shears invisible to determinant-based measurements. You need to look at the shape itself to see that angles have changed.

Reflection

Reflection across the yy-axis negates the xx-coordinate of every point. The matrix is:

My=[1001]M_y = \begin{bmatrix}-1 & 0 \\ 0 & 1\end{bmatrix}

This sends ı^\hat{\imath} to (1,0)(-1, 0) -- it flips the horizontal axis -- while leaving ȷ^\hat{\jmath} unchanged. The result is a mirror image: left becomes right, right becomes left.

mirror original reflected F i' = (-1, 0) j' = (0, 1) Reflect across y [-1, 0] [ 0, 1]

The F is mirrored across the yy-axis. The crossbars now point left instead of right. The dashed lines connect corresponding points, showing equal distances from the mirror line. This is the visual signature of a negative determinant: orientation is reversed.

The determinant of the reflection matrix is 1-1. That negative sign is the mathematical fingerprint of a reflection: it tells you the transformation reverses orientation. A clockwise-wound polygon becomes counterclockwise. In graphics programming, this flips the winding order of triangles, which is how renderers decide which face of a surface you're looking at.

You can reflect across any line through the origin. Reflection across the xx-axis uses [1001]\begin{bmatrix}1 & 0 \\ 0 & -1\end{bmatrix}. Reflection across the line y=xy = x uses [0110]\begin{bmatrix}0 & 1 \\ 1 & 0\end{bmatrix}. Each one has determinant 1-1.

Composition: scale then rotate

The real power comes from combining transformations. In practice, you rarely apply just one. A game sprite needs to be scaled to the right size, rotated to face the right direction, and (eventually, with homogeneous coordinates) translated to the right position.

Let's scale the F by (1.5,1.5)(1.5, 1.5) -- uniform scaling to 150% -- and then rotate it by 30 degrees. We do the scaling first, then the rotation. In matrix notation, that means R(30°)S(1.5,1.5)R(30°) \cdot S(1.5, 1.5) -- the rightmost matrix is applied first.

S=[1.5001.5],R=[0.8660.50.50.866]S = \begin{bmatrix}1.5 & 0 \\ 0 & 1.5\end{bmatrix}, \quad R = \begin{bmatrix}0.866 & -0.5 \\ 0.5 & 0.866\end{bmatrix}

The combined matrix:

RS=[0.8660.50.50.866][1.5001.5]=[1.300.750.751.30]R \cdot S = \begin{bmatrix}0.866 & -0.5 \\ 0.5 & 0.866\end{bmatrix}\begin{bmatrix}1.5 & 0 \\ 0 & 1.5\end{bmatrix} = \begin{bmatrix}1.30 & -0.75 \\ 0.75 & 1.30\end{bmatrix}

original step 1: scaled 1.5x scale, then rotate 30° i' j' R(30°) . S(1.5) [1.30, -0.75] [0.75, 1.30] det = 2.25 (area x2.25)

Three stages: the faint dashed F is the original. The orange dashed outline shows the intermediate result after scaling by 1.5. The solid blue F is the final result after both scaling and rotating 30 degrees. The combined matrix in the box represents both operations as a single transformation.

The key insight: you can pre-compute that combined matrix once and then apply it to every vertex of your shape. Whether your sprite has 4 vertices or 4000, you multiply each one by the same 2×22 \times 2 matrix. This is why game engines and graphics APIs work with transformation matrices -- they collapse an entire pipeline of operations into a single matrix multiply per vertex.

Remember: matrices compose right to left. The matrix on the right acts first. RSR \cdot S means "scale first (SS), then rotate (RR)." If you did SRS \cdot R instead, you'd rotate first and then scale, which gives a different result (as we saw in the matrix multiplication chapter).

The formal bit

Every 2D linear transformation corresponds to a 2×22 \times 2 matrix. Here are the essential ones:

Rotation by angle θ\theta counterclockwise:

R(θ)=[cosθsinθsinθcosθ],det=1R(\theta) = \begin{bmatrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{bmatrix}, \quad \det = 1

Scaling by factors sxs_x and sys_y:

S(sx,sy)=[sx00sy],det=sxsyS(s_x, s_y) = \begin{bmatrix}s_x & 0 \\ 0 & s_y\end{bmatrix}, \quad \det = s_x \cdot s_y

Horizontal shear by factor kk:

H(k)=[1k01],det=1H(k) = \begin{bmatrix}1 & k \\ 0 & 1\end{bmatrix}, \quad \det = 1

Reflection across the yy-axis:

My=[1001],det=1M_y = \begin{bmatrix}-1 & 0 \\ 0 & 1\end{bmatrix}, \quad \det = -1

Some important patterns:

One more pattern worth noting: rotation matrices are orthogonal (their columns are perpendicular unit vectors, and RT=R1R^T = R^{-1}). Undoing a rotation is the same as transposing its matrix -- no expensive inverse computation needed.

Worked example: building a sprite transform pipeline

You're building a 2D game. An enemy spaceship sprite needs to be positioned on screen. The sprite's vertices are defined in "model space" -- a local coordinate system where the ship is centered at the origin, pointing right, with a wingspan of 2 units.

The game logic says: this frame, the ship should be at position (5,3)(5, 3), rotated 60 degrees, and scaled to 80% size. You need to transform every vertex from model space to screen space.

Step 1: Scale to 80%.

S=[0.8000.8]S = \begin{bmatrix}0.8 & 0 \\ 0 & 0.8\end{bmatrix}

This shrinks the ship uniformly. A wingtip at (1,0.5)(1, 0.5) moves to (0.8,0.4)(0.8, 0.4).

Step 2: Rotate 60 degrees (cos60°=0.5\cos 60° = 0.5, sin60°=0.866\sin 60° = 0.866).

R=[0.50.8660.8660.5]R = \begin{bmatrix}0.5 & -0.866 \\ 0.866 & 0.5\end{bmatrix}

Step 3: Compose. The combined scale-then-rotate matrix is RSR \cdot S:

M=RS=[0.50.8660.8660.5][0.8000.8]=[0.40.6930.6930.4]M = R \cdot S = \begin{bmatrix}0.5 & -0.866 \\ 0.866 & 0.5\end{bmatrix}\begin{bmatrix}0.8 & 0 \\ 0 & 0.8\end{bmatrix} = \begin{bmatrix}0.4 & -0.693 \\ 0.693 & 0.4\end{bmatrix}

Step 4: Apply to every vertex. For the wingtip at (1,0.5)(1, 0.5):

M[10.5]=[0.41+(0.693)0.50.6931+0.40.5]=[0.0540.893]M\begin{bmatrix}1 \\ 0.5\end{bmatrix} = \begin{bmatrix}0.4 \cdot 1 + (-0.693) \cdot 0.5 \\ 0.693 \cdot 1 + 0.4 \cdot 0.5\end{bmatrix} = \begin{bmatrix}0.054 \\ 0.893\end{bmatrix}

The wingtip moved from (1,0.5)(1, 0.5) to approximately (0.05,0.89)(0.05, 0.89) -- scaled down and rotated.

Step 5: Translation (adding position (5,3)(5, 3)) can't be done with a 2×22 \times 2 matrix -- that's a limitation we'll solve in the next chapter. For now, you'd add the position vector after the matrix multiply:

function transformSprite(vertices, scaleMatrix, rotMatrix, position) {
  // Compose once
  const M = multiply2x2(rotMatrix, scaleMatrix);

  // Apply to all vertices
  return vertices.map(v => {
    const transformed = multiply2x2byVec(M, v);
    return [transformed[0] + position[0],
            transformed[1] + position[1]];
  });
}

The matrix multiplication happens once. The per-vertex cost is a single matrix-vector multiply plus a vector addition. With 100 vertices, you do 1 matrix multiply and 100 cheap transforms instead of 200 separate operations.

Key Takeaway: Every 2D linear transformation is a 2x2 matrix. Rotation, scaling, shearing, and reflection each have a clean matrix form. Combining transforms means multiplying matrices -- right to left. Pre-compute the combined matrix once, then apply it to every vertex. This is the foundation of every 2D graphics pipeline.

What's next

There's one essential operation missing from our toolbox: translation -- moving an object to a different position. Look at step 5 of our worked example: we had to add the position separately because a 2×22 \times 2 matrix can't move the origin. Every linear transformation fixes the origin in place.

Homogeneous coordinates solve this elegantly. By embedding 2D points into 3D space (adding a third coordinate that's always 1), translation becomes a matrix multiplication too. Then all transforms -- scale, rotate, shear, reflect, and translate -- can be composed into a single 3×33 \times 3 matrix. One multiply per vertex handles everything.