Chapter 16
2D Transformations
Rotation, scaling, shearing, reflection -- you've met them individually. Now let's see them as matrices, combine them, and build real 2D graphics pipelines.
Every 2D linear transformation can be written as a 2x2 matrix. We've seen this abstractly: the columns of the matrix are where the basis vectors land. But when you're building a game engine, a UI framework, or a data visualization, you need the specific matrices for specific operations. What's the matrix for a 45-degree rotation? For flipping an image? For stretching a sprite wider without making it taller?
This chapter gives you the toolbox. Each transformation gets its own matrix, its own visual, and its own formula. Then we combine them -- because real graphics pipelines chain multiple transformations together, and the combined effect is a single matrix you can apply to every vertex at once.
We'll use a letter "F" as our test shape throughout. The F is perfect for this because it's asymmetric -- you can immediately tell when it's been rotated, flipped, or sheared. A circle or a square would hide half the transformation from you.
Rotation
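The rotation matrix for angle $\theta$ (counterclockwise) is:
$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
Where does this come from? Rotating $\hat{\imath} = (1, 0)$ by $\theta$ gives $(\cos\theta, \sin\theta)$. Rotating $\hat{\jmath} = (0, 1)$ by $\theta$ gives $(-\sin\theta, \cos\theta)$. Those become the columns of the matrix. That's it -- the matrix is just "where do the basis vectors land?"
For $\theta = 45^\circ$, we have $\cos\theta = \sin\theta = \frac{\sqrt{2}}{2} \approx 0.707$:
$$R(45^\circ) \approx \begin{bmatrix} 0.707 & -0.707 \\ 0.707 & 0.707 \end{bmatrix}$$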
Watch the F rotate 45 degrees counterclockwise. The original is dashed, the rotated version is solid.
The dashed F is the original. The solid F has been rotated 45 degrees counterclockwise. Every point swings along a circular arc around the origin. The basis vectors $\hat{\imath}$ and $\hat{\jmath}$ both rotate by the same angle, staying perpendicular and unit-length.
Notice what rotation preserves: lengths, angles, and the shape of the F. The determinant of any rotation matrix is $\cos^2\theta + \sin^2\theta = 1$ -- no area change. Rotation is the most "well-behaved" transformation.
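The rotation formula can be checked numerically. The following is a minimal sketch, not a definitive implementation: the helper names (`rotationMatrix`, `applyMatrix`, `det2x2`) and the row-major nested-array layout are assumptions made for illustration.

```javascript
// Build the 2x2 rotation matrix for a counterclockwise angle in degrees.
// Assumed layout: row-major nested arrays, [[a, b], [c, d]].
function rotationMatrix(degrees) {
  const t = (degrees * Math.PI) / 180;
  return [
    [Math.cos(t), -Math.sin(t)],
    [Math.sin(t), Math.cos(t)],
  ];
}

// Multiply a 2x2 matrix by a column vector [x, y].
function applyMatrix(M, v) {
  return [
    M[0][0] * v[0] + M[0][1] * v[1],
    M[1][0] * v[0] + M[1][1] * v[1],
  ];
}

// Determinant of a 2x2 matrix: ad - bc.
function det2x2(M) {
  return M[0][0] * M[1][1] - M[0][1] * M[1][0];
}

const R45 = rotationMatrix(45);
const rotated = applyMatrix(R45, [1, 0]); // where the first basis vector lands

console.log(rotated);                // roughly [0.707, 0.707]
console.log(det2x2(R45));            // 1, up to floating-point rounding
console.log(Math.hypot(...rotated)); // length stays 1, up to rounding
```

The first basis vector lands on the first column of the matrix, which is the "columns are where the basis vectors land" idea in executable form.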
Scaling
Non-uniform scaling stretches space by different amounts along each axis. The matrix is:
$$S(s_x, s_y) = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}$$
Each basis vector just gets multiplied by its corresponding scale factor. $\hat{\imath}$ becomes $(s_x, 0)$ and $\hat{\jmath}$ becomes $(0, s_y)$. The zeros mean the axes don't tilt -- they stay aligned with the coordinate axes.
Let's scale by $s_x = 2$ horizontally and $s_y = 0.6$ vertically. The F gets stretched wide and squished short.
The F is stretched to double its width but squished to 60% of its height. Horizontal proportions are preserved -- the crossbars still look right -- but the overall aspect ratio is distorted. The determinant is $2 \times 0.6 = 1.2$, so area increases by 20%.
The diagonal structure of the scaling matrix is what keeps the axes aligned. If you scale uniformly ($s_x = s_y = s$), the matrix becomes $sI$ and every direction gets scaled equally -- a pure zoom.
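Here is a small sketch of the scaling example, using the factors 2 and 0.6 from the text. The helper names and the row-major array layout are assumptions, and the sample point `[1, 1]` is a hypothetical vertex chosen for illustration.

```javascript
// Scaling matrix. Assumed layout: row-major nested arrays, [[a, b], [c, d]].
function scalingMatrix(sx, sy) {
  return [
    [sx, 0],
    [0, sy],
  ];
}

// Multiply a 2x2 matrix by a column vector [x, y].
function applyMatrix(M, v) {
  return [
    M[0][0] * v[0] + M[0][1] * v[1],
    M[1][0] * v[0] + M[1][1] * v[1],
  ];
}

// Determinant of a 2x2 matrix: ad - bc.
function det2x2(M) {
  return M[0][0] * M[1][1] - M[0][1] * M[1][0];
}

const S = scalingMatrix(2, 0.6);
const scaledPoint = applyMatrix(S, [1, 1]); // hypothetical vertex at (1, 1)
const areaFactor = det2x2(S);

console.log(scaledPoint); // x doubles, y shrinks to 60%
console.log(areaFactor);  // about 1.2, so area grows by 20%
```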
Shear
A horizontal shear tilts the vertical axis while keeping the horizontal axis fixed. The matrix is:
$$\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$$
The basis vector $\hat{\imath}$ doesn't move (column 1 is $(1, 0)$). But $\hat{\jmath}$ slides sideways to $(k, 1)$ -- still unit height, but shifted horizontally by $k$. Points higher up get pushed further to the right.
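With a positive shear factor $k$, every point at height $y$ slides right by $ky$:
$$\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x + ky \\ y \end{bmatrix}$$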
The bottom of the F stays put (points at $y = 0$ don't move). Points higher up slide further right. The F looks italicized -- that's exactly what italic text is, geometrically: a shear. The dashed lines show how much each height level has shifted.
Shearing is subtle because it preserves area -- the determinant of a shear matrix is always 1. The shape is distorted, but the total area it covers doesn't change. This makes shears invisible to determinant-based measurements. You need to look at the shape itself to see that angles have changed.
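The area claim can be checked on a concrete shape. This sketch shears a unit square and measures its area with the shoelace formula. The shear factor 0.5, the unit square, and all helper names are hypothetical choices for illustration, not values taken from the chapter.

```javascript
// Horizontal shear matrix. Assumed layout: row-major, [[a, b], [c, d]].
function shearMatrix(k) {
  return [
    [1, k],
    [0, 1],
  ];
}

// Multiply a 2x2 matrix by a column vector [x, y].
function applyMatrix(M, v) {
  return [
    M[0][0] * v[0] + M[0][1] * v[1],
    M[1][0] * v[0] + M[1][1] * v[1],
  ];
}

// Signed polygon area via the shoelace formula
// (positive when the vertices are listed counterclockwise).
function signedArea(points) {
  let sum = 0;
  for (let i = 0; i < points.length; i++) {
    const [x1, y1] = points[i];
    const [x2, y2] = points[(i + 1) % points.length];
    sum += x1 * y2 - x2 * y1;
  }
  return sum / 2;
}

const square = [[0, 0], [1, 0], [1, 1], [0, 1]];
const k = 0.5; // hypothetical shear factor
const sheared = square.map(p => applyMatrix(shearMatrix(k), p));

console.log(sheared);             // bottom edge fixed, top edge shifted right by k
console.log(signedArea(square));  // area before the shear
console.log(signedArea(sheared)); // same area after the shear
```

The square becomes a parallelogram, so its angles change, yet the two areas agree.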
Reflection
Reflection across the $y$-axis negates the $x$-coordinate of every point. The matrix is:
$$\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$$
This sends $\hat{\imath}$ to $(-1, 0)$ -- it flips the horizontal axis -- while leaving $\hat{\jmath}$ unchanged. The result is a mirror image: left becomes right, right becomes left.
The F is mirrored across the $y$-axis. The crossbars now point left instead of right. The dashed lines connect corresponding points, showing equal distances from the mirror line. This is the visual signature of a negative determinant: orientation is reversed.
The determinant of the reflection matrix is $(-1)(1) - (0)(0) = -1$. That negative sign is the mathematical fingerprint of a reflection: it tells you the transformation reverses orientation. A clockwise-wound polygon becomes counterclockwise. In graphics programming, this flips the winding order of triangles, which is how renderers decide which face of a surface you're looking at.
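You can reflect across any line through the origin. Reflection across the $x$-axis uses $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$. Reflection across the line $y = x$ uses $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$. Each one has determinant $-1$.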
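The winding-order flip can be seen with a signed-area computation. This is a sketch under stated assumptions: the triangle is a hypothetical example, the helper names are made up for illustration, and the signed area is used as a stand-in for how a renderer might test winding.

```javascript
// Reflection across the y-axis. Assumed layout: row-major, [[a, b], [c, d]].
const reflectAcrossY = [
  [-1, 0],
  [0, 1],
];

// Multiply a 2x2 matrix by a column vector [x, y].
function applyMatrix(M, v) {
  return [
    M[0][0] * v[0] + M[0][1] * v[1],
    M[1][0] * v[0] + M[1][1] * v[1],
  ];
}

// Signed area: positive for counterclockwise winding, negative for clockwise.
function signedArea(points) {
  let sum = 0;
  for (let i = 0; i < points.length; i++) {
    const [x1, y1] = points[i];
    const [x2, y2] = points[(i + 1) % points.length];
    sum += x1 * y2 - x2 * y1;
  }
  return sum / 2;
}

const triangle = [[0, 0], [1, 0], [0, 1]]; // hypothetical, listed counterclockwise
const mirrored = triangle.map(p => applyMatrix(reflectAcrossY, p));

const areaBefore = signedArea(triangle);
const areaAfter = signedArea(mirrored);

console.log(areaBefore); // positive: counterclockwise
console.log(areaAfter);  // same magnitude, negative: clockwise
```

The magnitude of the area is unchanged, which matches a determinant of absolute value 1. Only the sign, and with it the winding, flips.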
Composition: scale then rotate
The real power comes from combining transformations. In practice, you rarely apply just one. A game sprite needs to be scaled to the right size, rotated to face the right direction, and (eventually, with homogeneous coordinates) translated to the right position.
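Let's scale the F by $s = 1.5$ -- uniform scaling to 150% -- and then rotate it by 30 degrees. We do the scaling first, then the rotation. In matrix notation, that means $M = R \cdot S$ -- the rightmost matrix is applied first.
The combined matrix:
$$M = R(30^\circ) \cdot S(1.5) \approx \begin{bmatrix} 0.866 & -0.5 \\ 0.5 & 0.866 \end{bmatrix} \begin{bmatrix} 1.5 & 0 \\ 0 & 1.5 \end{bmatrix} \approx \begin{bmatrix} 1.299 & -0.75 \\ 0.75 & 1.299 \end{bmatrix}$$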
Three stages: the faint dashed F is the original. The orange dashed outline shows the intermediate result after scaling by 1.5. The solid blue F is the final result after both scaling and rotating 30 degrees. The combined matrix in the box represents both operations as a single transformation.
The key insight: you can pre-compute that combined matrix once and then apply it to every vertex of your shape. Whether your sprite has 4 vertices or 4000, you multiply each one by the same matrix. This is why game engines and graphics APIs work with transformation matrices -- they collapse an entire pipeline of operations into a single matrix multiply per vertex.
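Remember: matrices compose right to left. The matrix on the right acts first. $R \cdot S$ means "scale first ($S$), then rotate ($R$)." If you did $S \cdot R$ instead, you'd rotate first and then scale. With the uniform scale used here the two orders happen to give the same matrix, because a uniform scale is a multiple of the identity and commutes with every matrix. With a non-uniform scale they generally give different results (as we saw in the matrix multiplication chapter).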
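The composition can be sketched in code as follows. The helper names and the row-major layout are assumptions, and the test vertex `[1, 2]` is hypothetical. For the order comparison the sketch uses a non-uniform scale (factors 2 and 0.6), because a uniform scale commutes with any rotation and would not show a difference.

```javascript
// Assumed layout throughout: row-major nested arrays, [[a, b], [c, d]].
function rotationMatrix(degrees) {
  const t = (degrees * Math.PI) / 180;
  return [
    [Math.cos(t), -Math.sin(t)],
    [Math.sin(t), Math.cos(t)],
  ];
}

function scalingMatrix(sx, sy) {
  return [
    [sx, 0],
    [0, sy],
  ];
}

// Matrix product A * B. As a transformation, B is applied first, then A.
function multiply2x2(A, B) {
  return [
    [A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
    [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]],
  ];
}

function applyMatrix(M, v) {
  return [
    M[0][0] * v[0] + M[0][1] * v[1],
    M[1][0] * v[0] + M[1][1] * v[1],
  ];
}

// Scale by 1.5, then rotate by 30 degrees: M = R * S.
const R = rotationMatrix(30);
const S = scalingMatrix(1.5, 1.5);
const M = multiply2x2(R, S);
console.log(M); // roughly [[1.299, -0.75], [0.75, 1.299]]

// Applying M once matches applying S and then R step by step.
const vertex = [1, 2]; // hypothetical vertex
const oneStep = applyMatrix(M, vertex);
const twoSteps = applyMatrix(R, applyMatrix(S, vertex));

// Order comparison with a non-uniform scale.
const S2 = scalingMatrix(2, 0.6);
const scaleThenRotate = multiply2x2(R, S2);
const rotateThenScale = multiply2x2(S2, R);
console.log(scaleThenRotate[0][1], rotateThenScale[0][1]); // about -0.3 vs -1
```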
The formal bit
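Every 2D linear transformation corresponds to a $2 \times 2$ matrix. Here are the essential ones:
Rotation by angle $\theta$ counterclockwise:
$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
Scaling by factors $s_x$ and $s_y$:
$$S(s_x, s_y) = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}$$
Horizontal shear by factor $k$:
$$\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$$
Reflection across the $y$-axis:
$$\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$$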
Some important patterns:
- Rotations and shears preserve area ($\det = 1$), but shears change angles while rotations don't.
- Scaling changes area by a factor of $|s_x s_y|$.
- Reflections have $\det = -1$, signaling orientation reversal.
- Composing any of these (by matrix multiplication) gives another $2 \times 2$ matrix. The resulting transformation is always linear -- grid lines stay straight and evenly spaced, and the origin stays fixed.
- Composition is not commutative. $AB \neq BA$ in general. The order you apply transformations matters.
One more pattern worth noting: rotation matrices are orthogonal (their columns are perpendicular unit vectors, and $R^{-1} = R^T$). Undoing a rotation is the same as transposing its matrix -- no expensive inverse computation needed.
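The transpose property is easy to check numerically. In this minimal sketch the helper names and the row-major layout are assumptions, and the 60-degree angle is an arbitrary choice.

```javascript
// Assumed layout: row-major nested arrays, [[a, b], [c, d]].
function rotationMatrix(degrees) {
  const t = (degrees * Math.PI) / 180;
  return [
    [Math.cos(t), -Math.sin(t)],
    [Math.sin(t), Math.cos(t)],
  ];
}

// Swap rows and columns.
function transpose2x2(M) {
  return [
    [M[0][0], M[1][0]],
    [M[0][1], M[1][1]],
  ];
}

function multiply2x2(A, B) {
  return [
    [A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
    [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]],
  ];
}

const R = rotationMatrix(60);
const shouldBeIdentity = multiply2x2(transpose2x2(R), R);

console.log(shouldBeIdentity); // close to [[1, 0], [0, 1]], up to rounding
```

Since the transpose of $R(\theta)$ is $R(-\theta)$, transposing simply rotates back by the same angle.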
Worked example: building a sprite transform pipeline
You're building a 2D game. An enemy spaceship sprite needs to be positioned on screen. The sprite's vertices are defined in "model space" -- a local coordinate system where the ship is centered at the origin, pointing right, with a wingspan of 2 units.
The game logic says: this frame, the ship should be at a given screen position, rotated 60 degrees, and scaled to 80% size. You need to transform every vertex from model space to screen space.
Step 1: Scale to 80%.
$$S = \begin{bmatrix} 0.8 & 0 \\ 0 & 0.8 \end{bmatrix}$$
This shrinks the ship uniformly. A wingtip at $(x, y)$ moves to $(0.8x, 0.8y)$.
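Step 2: Rotate 60 degrees ($\cos 60^\circ = 0.5$, $\sin 60^\circ \approx 0.866$).
$$R \approx \begin{bmatrix} 0.5 & -0.866 \\ 0.866 & 0.5 \end{bmatrix}$$
Step 3: Compose. The combined scale-then-rotate matrix is $M = R \cdot S$:
$$M \approx \begin{bmatrix} 0.5 & -0.866 \\ 0.866 & 0.5 \end{bmatrix} \begin{bmatrix} 0.8 & 0 \\ 0 & 0.8 \end{bmatrix} \approx \begin{bmatrix} 0.4 & -0.693 \\ 0.693 & 0.4 \end{bmatrix}$$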
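Step 4: Apply to every vertex. For a wingtip at $(x, y)$:
$$M \begin{bmatrix} x \\ y \end{bmatrix} \approx \begin{bmatrix} 0.4x - 0.693y \\ 0.693x + 0.4y \end{bmatrix}$$
The wingtip ends up at 80% of its original distance from the origin, turned 60 degrees counterclockwise -- scaled down and rotated.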
Step 5: Translation (adding the ship's position) can't be done with a $2 \times 2$ matrix -- that's a limitation we'll solve in the next chapter. For now, you'd add the position vector after the matrix multiply:
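// Helpers (assumed layout: row-major nested arrays, [[a, b], [c, d]])
const multiply2x2 = (A, B) => [
  [A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
  [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]],
];
const multiply2x2byVec = (M, v) => [
  M[0][0] * v[0] + M[0][1] * v[1],
  M[1][0] * v[0] + M[1][1] * v[1],
];
function transformSprite(vertices, scaleMatrix, rotMatrix, position) {
  // Compose once: the scale is applied first, so it goes on the right
  const M = multiply2x2(rotMatrix, scaleMatrix);
  // Apply the combined matrix to each vertex, then add the position
  return vertices.map(v => {
    const transformed = multiply2x2byVec(M, v);
    return [transformed[0] + position[0],
            transformed[1] + position[1]];
  });
}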
The matrix multiplication happens once. The per-vertex cost is a single matrix-vector multiply plus a vector addition. With 100 vertices, you do 1 matrix-matrix multiply and 100 matrix-vector multiplies, instead of 200 matrix-vector multiplies (a separate scale and a separate rotation for each vertex).
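The numbers in the worked example can be reproduced with a short script. This is a sketch only: the wingtip coordinates `[0, 1]` and the screen position `[5, 3]` are hypothetical values chosen for illustration, and the helper names and row-major layout are assumptions.

```javascript
// Assumed layout: row-major nested arrays, [[a, b], [c, d]].
function multiply2x2(A, B) {
  return [
    [A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
    [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]],
  ];
}

function applyMatrix(M, v) {
  return [
    M[0][0] * v[0] + M[0][1] * v[1],
    M[1][0] * v[0] + M[1][1] * v[1],
  ];
}

// Step 1: scale to 80%.
const S = [
  [0.8, 0],
  [0, 0.8],
];

// Step 2: rotate 60 degrees counterclockwise.
const t = Math.PI / 3;
const R = [
  [Math.cos(t), -Math.sin(t)],
  [Math.sin(t), Math.cos(t)],
];

// Step 3: compose once. Scale acts first, so it sits on the right.
const M = multiply2x2(R, S);
console.log(M); // roughly [[0.4, -0.693], [0.693, 0.4]]

// Steps 4 and 5 with hypothetical inputs.
const wingtip = [0, 1];  // hypothetical model-space vertex
const position = [5, 3]; // hypothetical screen position

const viaCombined = applyMatrix(M, wingtip);
const viaTwoSteps = applyMatrix(R, applyMatrix(S, wingtip));
const onScreen = [viaCombined[0] + position[0], viaCombined[1] + position[1]];

console.log(viaCombined); // roughly [-0.693, 0.4]
console.log(onScreen);    // roughly [4.307, 3.4]
```

`viaCombined` and `viaTwoSteps` agree up to rounding, which is why the composed matrix can safely replace the separate scale and rotate passes.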
Key Takeaway: Every 2D linear transformation is a 2x2 matrix. Rotation, scaling, shearing, and reflection each have a clean matrix form. Combining transforms means multiplying matrices -- right to left. Pre-compute the combined matrix once, then apply it to every vertex. This is the foundation of every 2D graphics pipeline.
What's next
There's one essential operation missing from our toolbox: translation -- moving an object to a different position. Look at step 5 of our worked example: we had to add the position separately because a matrix can't move the origin. Every linear transformation fixes the origin in place.
Homogeneous coordinates solve this elegantly. By embedding 2D points into 3D space (adding a third coordinate that's always 1), translation becomes a matrix multiplication too. Then all transforms -- scale, rotate, shear, reflect, and translate -- can be composed into a single matrix. One multiply per vertex handles everything.
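As a preview of that idea, here is a minimal sketch of translation as a matrix multiply. It only illustrates the description above; the next chapter's construction may differ in detail. The function names, the row-major $3 \times 3$ layout, and the sample point and offsets are all assumptions.

```javascript
// 3x3 translation matrix in homogeneous coordinates (assumed row-major layout).
function translationMatrix(tx, ty) {
  return [
    [1, 0, tx],
    [0, 1, ty],
    [0, 0, 1],
  ];
}

// Multiply a 3x3 matrix by a column vector [x, y, w].
function apply3x3(M, v) {
  return M.map(row => row[0] * v[0] + row[1] * v[1] + row[2] * v[2]);
}

// The 2D point (1, 1) becomes [1, 1, 1]: the third coordinate is always 1.
const point = [1, 1, 1];
const moved = apply3x3(translationMatrix(3, -2), point); // hypothetical offsets

console.log(moved); // x shifted by 3, y shifted by -2, third coordinate still 1
```

The extra column carries the offset, and the constant 1 in the point is what lets the multiply add it.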