matrix conventions

Matrix transformations are a powerful tool in computer graphics, but they can be tricky to get right. This article won’t talk about the detailed mathematics of matrices (you can find plenty of descriptions of that elsewhere); instead, it is about the practicalities of getting matrix transformations correct in your program code.

First of all, let me offer two important rules when writing matrix-manipulation code:

Now let me give a handy tip for trying to envision the effect of a sequence of matrix transformations. Assuming the usual pre-multiplication-of-column-vectors convention, we can imagine each matrix as a magic box that, when we look through it, transforms the appearance of space on the other side, making things bigger or smaller, rotating or repositioning them and so on. So if we represent object space by the object we are looking at (here a simple line-drawing of a teapot), and world space by the eye of the observer, then a transformation like

world-space vector ← matrix ← object-space vector

can be visualized as

where the purple arrows show the flow of geometric information from object space (coordinate system (xo, yo)), through the matrix transformation box, to the observer’s eye (coordinate system (xw, yw)).

The basic principles apply equally to both 2D and 3D graphics. Here we are dealing with 2D graphics in Cairo. The examples will be in Python, using the high-level Qahirah binding. This lets us represent the transformation of a Vector object by a Matrix object directly as a simple Python expression:

user_space_coord = matrix * object_space_coord

Let us envision what happens if we apply a simple rotational transform to the object.

By superimposing both coordinate systems in the upper part of the diagram (the current one in black, the previous one in grey), we can see the differing effects of, for example, moving parallel to the axes in object space coordinates (xo, yo) versus world space coordinates (xw, yw). In the Python code, we can spread things out across multiple lines, to more closely approximate the arrangement in the diagram:

user_space_coord = \
    (
        Matrix.rotate(45 * deg)
    *
        object_space_coord
    )
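To put some numbers on this, here is a minimal sketch that does the same rotation with nothing but the standard library. The helpers rotate and apply, and the deg constant, are hypothetical stand-ins written for this illustration; they are not Qahirah's actual API, and they only assume the pre-multiplication-of-column-vectors convention described above.

```python
import math

deg = math.pi / 180  # assumed degrees-to-radians factor, standing in for the `deg` used above

def rotate(angle):
    """Hypothetical helper: top two rows of a 3x3 affine matrix for a rotation about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, 0.0),
            (s,  c, 0.0))

def apply(m, v):
    """Premultiply the column vector v = (x, y) by the affine matrix m."""
    x, y = v
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

object_space_coord = (1.0, 0.0)  # a point one unit along the object's x-axis
user_space_coord = apply(rotate(45 * deg), object_space_coord)
print(user_space_coord)
```

A point one unit along the object's x-axis comes out with equal x and y components of about 0.707 each, which is what we would expect from a 45° rotation.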

Now, what happens if we apply two transformations in succession?

Here the transformations (in order from object space to world space) are rotation about the origin, followed by translation along the positive y-axis. The rotation converts from object coordinates (xo, yo) to the intermediate coordinate system (xm, ym). The translation then converts from (xm, ym) to (xw, yw) coordinates. The equivalent Python code would be something like

user_space_coord = \
    (
        Matrix.translate((0, 10))
    *
        Matrix.rotate(45 * deg)
    *
        object_space_coord
    )

Now consider the reverse order, with the y-axis translation being applied first:

Thus, the rotation takes place, not about the (xo, yo) origin, but about the (xm, ym) origin.

Each transformation (blue background) is applied in the coordinate system of the picture of the object immediately below it (yellow background).

Note that, while the orientation of the teapot ends up the same in both these cases, its position is different. The equivalent Python code would be correspondingly rearranged to match the diagram:

user_space_coord = \
    (
        Matrix.rotate(45 * deg)
    *
        Matrix.translate((0, 10))
    *
        object_space_coord
    )
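If you want to check the claim about orientation and position for yourself, the following sketch compares the two orderings numerically. The Mat class is a hypothetical stand-in that I have written just for this comparison, so that the expressions can keep the same shape as the ones above; it is not Qahirah's Matrix class, and the name deg is likewise defined locally as an assumption.

```python
import math

class Mat:
    """Hypothetical stand-in for a 2D affine matrix; not Qahirah's Matrix class."""

    def __init__(self, xx, xy, x0, yx, yy, y0):
        # Represents the 3x3 matrix [[xx, xy, x0], [yx, yy, y0], [0, 0, 1]].
        self.rows = ((xx, xy, x0), (yx, yy, y0))

    @classmethod
    def rotate(cls, angle):
        c, s = math.cos(angle), math.sin(angle)
        return cls(c, -s, 0.0, s, c, 0.0)

    @classmethod
    def translate(cls, offset):
        dx, dy = offset
        return cls(1.0, 0.0, dx, 0.0, 1.0, dy)

    def __mul__(self, other):
        (a, b, c), (d, e, f) = self.rows
        if isinstance(other, Mat):
            # Matrix product: the right-hand matrix acts on the vector first.
            (p, q, r), (s, t, u) = other.rows
            return Mat(a * p + b * s, a * q + b * t, a * r + b * u + c,
                       d * p + e * s, d * q + e * t, d * r + e * u + f)
        x, y = other  # treat a 2-tuple as a column vector
        return (a * x + b * y + c, d * x + e * y + f)

deg = math.pi / 180  # assumed degrees-to-radians factor
origin, tip = (0.0, 0.0), (1.0, 0.0)  # two object-space points

rotate_first = Mat.translate((0, 10)) * Mat.rotate(45 * deg)
translate_first = Mat.rotate(45 * deg) * Mat.translate((0, 10))

for name, m in (("rotate first", rotate_first), ("translate first", translate_first)):
    o, t = m * origin, m * tip
    heading = (t[0] - o[0], t[1] - o[1])  # direction of the object's x-axis after transforming
    print(name, "origin ->", o, "heading ->", heading)
```

Both orderings give the same heading, but the object's origin lands at (0, 10) when the rotation is applied first, and at roughly (-7.07, 7.07) when the translation is applied first.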

So, when you look at the Python code, imagine the eye of the observer on the receiving end of the value of the expression, at the top, while the object coordinates are at the bottom, being processed through successive stages of the transformation until they get to the top. Each individual Matrix object corresponds to one of the boxes with a blue background, while the multiplication asterisk immediately below it corresponds to the picture with the yellow background immediately below that box.
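One way to make this bottom-to-top reading concrete is to apply the stages one at a time and record the coordinate after each step. The sketch below does this for the rotate-then-translate case; the helpers rotate, translate and apply are hypothetical stand-ins using only the standard library, not part of Qahirah.

```python
import math

deg = math.pi / 180  # assumed degrees-to-radians factor

def rotate(angle):
    """Hypothetical helper: top two rows of an affine rotation matrix."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, 0.0), (s, c, 0.0))

def translate(offset):
    """Hypothetical helper: top two rows of an affine translation matrix."""
    dx, dy = offset
    return ((1.0, 0.0, dx), (0.0, 1.0, dy))

def apply(m, v):
    """Premultiply the column vector v = (x, y) by the affine matrix m."""
    x, y = v
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

# Stages listed top to bottom, in the order they appear in the expression.
stages = [
    ("translate", translate((0, 10))),
    ("rotate", rotate(45 * deg)),
]

coord = (1.0, 0.0)  # the object-space coordinate at the bottom
trace = [("object space", coord)]
for name, m in reversed(stages):  # process from the bottom up
    coord = apply(m, coord)
    trace.append((name, coord))

for label, (x, y) in trace:
    print(f"{label:>12}: ({x:.3f}, {y:.3f})")
```

The last entry in the trace is what the observer at the top of the expression sees; swapping the two entries in stages reproduces the other ordering.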