Introduction
The main purpose of this article is to gather information about transformations in computer graphics and put it in one place. All of this information can be found in the depths of the Internet. Unfortunately, what is available there about matrices and transformations is not exhaustive and does not always explain the subtleties involved. In this article, I want to discuss these subtleties and give you a full picture of the problem.
One of the two culprits associated with transformations in computer graphics, invisible at first glance, is that matrix multiplication is, in general, not commutative. Of course, there are pairs of matrices whose multiplication is commutative, but such pairs are just the exception to the rule.
\begin{array}{c} \mathbf{A} * \mathbf{B} \neq \mathbf{B} * \mathbf{A} \end{array}
where A and B are matrices.
We’re used to the concept of commutative multiplication of numbers (scalars). Unfortunately, this property does not apply to matrix multiplication. This fact gives rise to some surprising rules that must be known and consistently adhered to.
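A minimal C++ sketch of this property, using two small integer matrices chosen only for illustration (the `Mat2` type and the helper names are hypothetical, not part of any library). The checks run at compile time, so the file compiles only if the claims hold.

```cpp
// 2x2 integer matrix; the type and function names are illustrative only.
struct Mat2 {
    int m[2][2];
};

// Standard row-by-column matrix product.
constexpr Mat2 mul(const Mat2& a, const Mat2& b) {
    Mat2 r{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Element-wise comparison of two matrices.
constexpr bool equal(const Mat2& a, const Mat2& b) {
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            if (a.m[i][j] != b.m[i][j]) return false;
    return true;
}

constexpr Mat2 A{{{1, 2}, {3, 4}}};
constexpr Mat2 B{{{0, 1}, {1, 0}}};
constexpr Mat2 Id{{{1, 0}, {0, 1}}};

// A * B swaps the columns of A, while B * A swaps its rows.
static_assert(!equal(mul(A, B), mul(B, A)), "A * B differs from B * A");
// The identity matrix is one of the exceptions that commute.
static_assert(equal(mul(A, Id), mul(Id, A)), "the identity commutes with A");
```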
Secondly, in a 3D Cartesian coordinate system, the cross product of two vectors is anticommutative.
\begin{array}{c} \vec{a} \times \vec{b} = - \vec{b} \times \vec{a} \end{array}
This property means that two non-collinear vectors alone do not single out one result: both \vec{a} \times \vec{b} and its negation are perpendicular to them, and both are equally correct solutions. To resolve this ambiguity we need to choose an orientation (or “handedness”) for the 3D Cartesian coordinate system, and by choosing it we prefer one cross-product solution over the other.
Again, at first glance, we might say “it’s just adding a minus sign”; however, it is not that simple. This property determines not only the direction of the surface normal vectors, but also the direction in which points rotate around an axis. So when working with transformations in computer graphics, we must first choose the orientation we want and stick to this choice throughout.
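The sign flip can be checked with the usual component formula of the cross product. The following C++ sketch uses a hypothetical `Vec3` type and the two unit axes as sample input.

```cpp
// Minimal 3D vector; the type is illustrative only.
struct Vec3 {
    double x, y, z;
};

// Component formula of the cross product a x b.
constexpr Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

constexpr Vec3 xAxis{1, 0, 0};
constexpr Vec3 yAxis{0, 1, 0};

// X x Y = (0, 0, 1), while Y x X = (0, 0, -1).
static_assert(cross(xAxis, yAxis).z == 1.0, "X x Y points along +Z");
static_assert(cross(yAxis, xAxis).z == -1.0, "Y x X points along -Z");
```

Swapping the operands only changes the sign, but that sign decides which side of a surface a normal points to.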
Problem complexity
In the real world, many problems look very simple at first glance, and their complexity reveals itself only after a closer look. In this article, we want to get to the bottom of this problem. To do so, let’s use the old and well-known divide and conquer technique: break this complex problem down into separate sub-problems, then tackle and solve them one by one. This approach will ultimately help us create a neat cheat sheet that, hopefully, finally sums up all the “mess” we encounter when working with transformations in computer graphics.
These subproblems can be divided not only from the point of view of mathematics (theoretical), but also from the point of view of implementation (practical) into:
- Matrix storage in memory as a multidimensional array.
- Matrix multiplication order.
- 3D Cartesian coordinate system handedness.
Matrix storage in memory as a multidimensional array
In mathematics, a matrix is defined as a rectangular array of numbers arranged in rows and columns.
For example, the matrix below has 3 rows and 5 columns, and can be referred to as a \mathbf{3 \times 5} matrix.
\begin{array}{c} \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} & m_{15}\\ m_{21} & m_{22} & m_{23} & m_{24} & m_{25}\\ m_{31} & m_{32} & m_{33} & m_{34} & m_{35}\\ \end{bmatrix} \end{array}
But how can we represent this matrix in computer memory?
A common solution is to use multidimensional arrays. However, computer memory is linear; in other words, it is a one-dimensional array. Consequently, multidimensional arrays can be stored using two “simple” layout orders:
- Row layout order (row-major): the elements of each row are stored next to each other.
- Column layout order (column-major): the elements of each column are stored next to each other.
These “simple” layouts are used to store multidimensional arrays (matrices). The layout used depends on the programming language:
Layout order | Languages |
---|---|
Row layout order | C, C++, HLSL (with `row_major` modifier), GLSL (with `layout(row_major)` qualifier) |
Column layout order | Fortran, Matlab, HLSL (by default or with `column_major` modifier), GLSL (by default or with `layout(column_major)` qualifier) |
Multidimensional arrays in C/C++
As the table above shows, in C/C++ multidimensional arrays are stored in row-major order, that is, they are stored in memory row by row.
For example, the following matrix:
\begin{array}{c} \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} & m_{15}\\ m_{21} & m_{22} & m_{23} & m_{24} & m_{25}\\ m_{31} & m_{32} & m_{33} & m_{34} & m_{35}\\ \end{bmatrix} \end{array}
in memory is stored as:
\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|} \hline & m_{11} & m_{12} & m_{13} & m_{14} & m_{15} & m_{21} & m_{22} & m_{23} & m_{24} & m_{25} & m_{31} & m_{32} & m_{33} & m_{34} & m_{35} & \\ \hline \end{array}
The same multidimensional array in C/C++ can be written as follows:
int foo [3][5];
where the variable `foo` is an array containing `3` arrays, each of which contains `5` variables of type `int`.
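The row-by-row order can be sketched in C++ as follows. The element values (10 * row + column) are stand-ins for the indices of the matrix elements, and `flat` and `rowMajorOffset` are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>

// Stand-in values: 10 * row + column, mirroring the indices of m_{row,column}.
constexpr int foo[3][5] = {
    {11, 12, 13, 14, 15},
    {21, 22, 23, 24, 25},
    {31, 32, 33, 34, 35},
};

// The same 15 values written in the order a row-major layout places them in memory.
constexpr int flat[15] = {11, 12, 13, 14, 15, 21, 22, 23,
                          24, 25, 31, 32, 33, 34, 35};

// Offset of element [row][col] from the start of the array, counted in elements.
constexpr std::size_t rowMajorOffset(std::size_t row, std::size_t col) {
    return row * 5 + col;
}

static_assert(sizeof(foo) == 15 * sizeof(int), "the rows are stored without gaps");
static_assert(foo[1][2] == flat[rowMajorOffset(1, 2)], "m_23 sits at offset 7");
static_assert(foo[2][4] == flat[rowMajorOffset(2, 4)], "m_35 sits at offset 14");
```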
How do we interpret this multidimensional array as a matrix, and especially those two numbers, `3` and `5`? Unfortunately, there are again two possibilities.
- `[Row][Column]` order
In this possibility, `foo[3][5]` is interpreted as a \mathbf{3 \times 5} matrix, which is more intuitive to mathematicians because it is “closer” to the matrix definition (the order of the array dimensions `[3][5]` coincides with the matrix dimensions 3 \times 5). To access an element of the matrix, we specify the desired row in the first square bracket and the desired column in the second. In this possibility, the rows of the matrix are the rows of the multidimensional array.
- `[Column][Row]` order
In this possibility, `foo[3][5]` is interpreted as a \mathbf{5 \times 3} matrix. To access an element of the matrix, we specify the desired column in the first square bracket and the desired row in the second. In this possibility, the columns of the matrix are the rows of the multidimensional array.
The upshot of this possibility is that even though C++ stores multidimensional arrays in the row-major layout, we can “pretend” that the matrix is stored in the column-major layout.
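Both interpretations read the very same memory. The sketch below makes this concrete with two hypothetical accessor functions over one C++ array; the element values are again stand-ins.

```cpp
#include <cstddef>

// One and the same array, filled with 10 * (first index + 1) + (second index + 1).
constexpr int foo[3][5] = {
    {11, 12, 13, 14, 15},
    {21, 22, 23, 24, 25},
    {31, 32, 33, 34, 35},
};

// [Row][Column] order: foo is a 3x5 matrix and the first index selects the row.
constexpr int elementRowColumn(std::size_t row, std::size_t col) {
    return foo[row][col];
}

// [Column][Row] order: foo is a 5x3 matrix and the first index selects the column.
constexpr int elementColumnRow(std::size_t row, std::size_t col) {
    return foo[col][row];
}

// Under [Row][Column], the values 11..15 form the first matrix row.
static_assert(elementRowColumn(0, 4) == 15, "");
// Under [Column][Row], the same five values form the first matrix column.
static_assert(elementColumnRow(4, 0) == 15, "");
// The 5x3 reading is the transpose of the 3x5 reading.
static_assert(elementColumnRow(4, 2) == elementRowColumn(2, 4), "");
```

Because each matrix column is contiguous in memory under the second reading, the array behaves like column-major storage of the 5 \times 3 matrix.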
Matrices in HLSL
In HLSL there are special data types for representing matrices up to 4 \times 4 and vectors with up to 4 components.
For example, the `float2x4` data type (with any modifier) is used to represent a 2 \times 4 matrix, and `float2` represents a 2-component row/column vector.
HLSL does have an overloaded `*` operator, but this operator performs component-wise multiplication (each element of the first matrix is multiplied by the corresponding element of the second matrix). Developers need to use the `mul(x, y)` function to multiply vectors and matrices. For example, in the call `vp = mul(v, m)` with a 4-component vector `v` and a `float4x3` matrix `m`, the vectors `v` and `vp` are treated as row vectors, and this operation can be written in mathematical form as:
\begin{array}{c} \vec{v_p} = \vec{v} * \mathbf{m} \\ \end{array} \\ \begin{array}{c} \begin{bmatrix} x & y & z \end{bmatrix} = \begin{bmatrix} v_1 & v_2 & v_3 & v_4 \end{bmatrix} * \begin{bmatrix} m_{11} & m_{12} & m_{13}\\ m_{21} & m_{22} & m_{23}\\ m_{31} & m_{32} & m_{33}\\ m_{41} & m_{42} & m_{43}\\ \end{bmatrix} \end{array}
When we do it the other way, `vp = mul(m, v)` with a `float3x4` matrix `m`, the vectors `v` and `vp` are treated as column vectors, and the operation can be written in mathematical form as:
\begin{array}{c} \vec{v_p} = \mathbf{m} * \vec{v}\\ \end{array} \\ \begin{array}{c} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14}\\ m_{21} & m_{22} & m_{23} & m_{24}\\ m_{31} & m_{32} & m_{33} & m_{34}\\ \end{bmatrix} * \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix} \end{array}
Lastly, multiplying two vectors with the `mul` function, as in `mul(x, y)`, is equivalent to the dot product of those vectors (i.e. `dot(x, y)`).
\begin{array}{c} d = \vec{a} \cdot \vec{b} \\ \end{array} \\ \begin{array}{c} d = \begin{bmatrix} a_1 & a_2 & a_3 & a_4\\ \end{bmatrix} * \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} \end{array}
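Shader code cannot be run outside a GPU pipeline, so the following C++ sketch only mimics the two matrix forms of `mul` described above. All types and the names `mulRow`, `mulCol` and `transpose` are hypothetical stand-ins, not HLSL or library APIs. It shows that multiplying a row vector by a matrix gives the same numbers as multiplying the transposed matrix by the column vector.

```cpp
struct Vec3 { double x, y, z; };
struct Vec4 { double x, y, z, w; };
struct Mat4x3 { double m[4][3]; };  // 4 rows, 3 columns
struct Mat3x4 { double m[3][4]; };  // 3 rows, 4 columns

// Mimics mul(v, m): v is a row vector standing on the left of the matrix.
constexpr Vec3 mulRow(const Vec4& v, const Mat4x3& a) {
    const double in[4] = {v.x, v.y, v.z, v.w};
    double out[3] = {0, 0, 0};
    for (int c = 0; c < 3; ++c)
        for (int r = 0; r < 4; ++r)
            out[c] += in[r] * a.m[r][c];
    return {out[0], out[1], out[2]};
}

// Mimics mul(m, v): v is a column vector standing on the right of the matrix.
constexpr Vec3 mulCol(const Mat3x4& a, const Vec4& v) {
    const double in[4] = {v.x, v.y, v.z, v.w};
    double out[3] = {0, 0, 0};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 4; ++c)
            out[r] += a.m[r][c] * in[c];
    return {out[0], out[1], out[2]};
}

constexpr Mat3x4 transpose(const Mat4x3& a) {
    Mat3x4 t{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 3; ++c)
            t.m[c][r] = a.m[r][c];
    return t;
}

constexpr Mat4x3 M{{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12}}};
constexpr Vec4 v{1, 0, 2, 1};

// Row vector times M equals transpose(M) times column vector.
static_assert(mulRow(v, M).x == mulCol(transpose(M), v).x, "");
static_assert(mulRow(v, M).y == mulCol(transpose(M), v).y, "");
static_assert(mulRow(v, M).z == mulCol(transpose(M), v).z, "");
```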
Matrices in GLSL
In GLSL there are special data types for representing matrices up to 4 \times 4 and vectors with up to 4 components.
For example, the `mat2x4` data type (with any modifier) is used to represent a 4 \times 2 matrix, and `vec2` represents a 2-component row/column vector.
GLSL has an overloaded `*` operator, which is used to multiply scalars as well as matrices and vectors. For example, in the expression `vp = v * m` with a `vec4` vector `v` and a `mat3x4` matrix `m` (3 columns, 4 rows), the vectors `v` and `vp` are treated as row vectors. This operation can be written in mathematical form as:
\begin{array}{c} \vec{v_p} = \vec{v} * \mathbf{m} \\ \end{array} \\ \begin{array}{c} \begin{bmatrix} x & y & z \end{bmatrix} = \begin{bmatrix} v_1 & v_2 & v_3 & v_4 \end{bmatrix} * \begin{bmatrix} m_{11} & m_{12} & m_{13}\\ m_{21} & m_{22} & m_{23}\\ m_{31} & m_{32} & m_{33}\\ m_{41} & m_{42} & m_{43}\\ \end{bmatrix} \end{array}
When we do it the other way, `vp = m * v` with a `mat4x3` matrix `m` (4 columns, 3 rows), the vectors `v` and `vp` are treated as column vectors. This operation can be written in mathematical form as:
\begin{array}{c} \vec{v_p} = \mathbf{m} * \vec{v}\\ \end{array} \\ \begin{array}{c} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14}\\ m_{21} & m_{22} & m_{23} & m_{24}\\ m_{31} & m_{32} & m_{33} & m_{34}\\ \end{bmatrix} * \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix} \end{array}
Lastly, when we multiply two vectors using the `*` operator, as in `x * y`, we do not compute the dot product; component-wise multiplication is performed instead.
To compute the dot product, the `dot(x, y)` function needs to be used.
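The difference between the two vector operations can be mimicked in C++. The `Vec4` type, the overloaded `operator*` and `dot` below are hypothetical stand-ins that only imitate the behaviour described above; they are not GLSL code.

```cpp
struct Vec4 {
    double x, y, z, w;
};

// Imitates `a * b` for two vectors: component-wise product, the result is a vector.
constexpr Vec4 operator*(const Vec4& a, const Vec4& b) {
    return {a.x * b.x, a.y * b.y, a.z * b.z, a.w * b.w};
}

// Imitates dot(a, b): sum of the component-wise products, the result is a scalar.
constexpr double dot(const Vec4& a, const Vec4& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

constexpr Vec4 a{1, 2, 3, 4};
constexpr Vec4 b{5, 6, 7, 8};

static_assert((a * b).y == 12.0, "second component: 2 * 6");
static_assert(dot(a, b) == 70.0, "5 + 12 + 21 + 32");
```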
Matrix multiplication order
In 3D applications, it is important to combine transformations such as translation, rotation, or scaling to create realistic virtual worlds. Since all transformations used in computer graphics can be represented as 4 \times 4 matrices and the concatenation of transformations is just a matrix multiplication, the order of the matrices in the multiplication is crucial.
The rule of matrix multiplication states that the number of columns in the left operand must be equal to the number of rows in the right operand.
\begin{array}{c} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \\ a_{41} & a_{42} \\ \end{bmatrix} * \begin{bmatrix} b_{11} & b_{12} & b_{13}\\ b_{21} & b_{22} & b_{23}\\ \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} & c_{13}\\ c_{21} & c_{22} & c_{23}\\ c_{31} & c_{32} & c_{33}\\ c_{41} & c_{42} & c_{43}\\ \end{bmatrix} \end{array}
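The dimension rule can be encoded in the type system. In the C++ sketch below (the `Matrix` template is a hypothetical illustration, not a library type), the shared inner dimension is a template parameter, so a product with mismatched sizes does not compile.

```cpp
#include <cstddef>

// Matrix with compile-time dimensions; illustrative only.
template <std::size_t Rows, std::size_t Cols>
struct Matrix {
    int m[Rows][Cols];
};

// (Rows x Inner) * (Inner x Cols) = (Rows x Cols).
// The inner dimension must match in both operands, otherwise deduction fails.
template <std::size_t Rows, std::size_t Inner, std::size_t Cols>
constexpr Matrix<Rows, Cols> operator*(const Matrix<Rows, Inner>& a,
                                       const Matrix<Inner, Cols>& b) {
    Matrix<Rows, Cols> out{};
    for (std::size_t i = 0; i < Rows; ++i)
        for (std::size_t j = 0; j < Cols; ++j)
            for (std::size_t k = 0; k < Inner; ++k)
                out.m[i][j] += a.m[i][k] * b.m[k][j];
    return out;
}

constexpr Matrix<4, 2> A{{{1, 2}, {3, 4}, {5, 6}, {7, 8}}};
constexpr Matrix<2, 3> B{{{1, 0, 1}, {0, 1, 1}}};

// A 4x2 matrix times a 2x3 matrix yields a 4x3 matrix.
constexpr Matrix<4, 3> C = A * B;
static_assert(C.m[0][0] == 1, "c_11 = 1*1 + 2*0");
static_assert(C.m[3][2] == 15, "c_43 = 7*1 + 8*1");

// B * A would not compile: B has 3 columns but A has 4 rows.
```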
Matrix multiplication (where A, B, C, D, E, Z are appropriate matrices) can be written as follows:
\begin{array}{c} \mathbf{Z} = \mathbf{A} * \mathbf{B} * \mathbf{C} * \mathbf{D} * \mathbf{E} \end{array}
Matrix multiplication in general is not commutative. So, to show our intentions regarding the sequence (order) of operations, it is helpful to use parentheses, although in mathematics a product written without parentheses is conventionally evaluated from left to right.
Matrix multiplication is associative such that (\mathbf{A} * \mathbf{B}) * \mathbf{C} = \mathbf{A} * (\mathbf{B} * \mathbf{C}) is true.
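A small C++ check of the difference between grouping and operand order, with three arbitrary integer matrices (the `Mat2` type and the values are illustrative assumptions):

```cpp
// 2x2 integer matrix; illustrative only.
struct Mat2 {
    int m[2][2];
};

constexpr Mat2 operator*(const Mat2& a, const Mat2& b) {
    Mat2 r{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

constexpr bool operator==(const Mat2& a, const Mat2& b) {
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            if (a.m[i][j] != b.m[i][j]) return false;
    return true;
}

constexpr Mat2 A{{{1, 2}, {3, 4}}};
constexpr Mat2 B{{{0, 1}, {1, 1}}};
constexpr Mat2 C{{{2, 0}, {1, 3}}};

// Moving the parentheses does not change the result...
static_assert((A * B) * C == A * (B * C), "grouping does not matter");
// ...but reversing the order of the operands does.
static_assert(!(A * B * C == C * B * A), "operand order matters");
```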
The two subsections below only underline the order (direction) in which the transformations are combined; because of associativity, the placement of the parentheses itself does not change the result.
Pre-multiplication
The matrix pre-multiplication order convention can be described as placing parentheses in a way (shown below) to indicate the intention of a particular order in which an operation (e.g. a combination of transformations) is performed.
\begin{array}{c} \mathbf{Z} = \mathbf{A} * \mathbf{B} * \mathbf{C} * \mathbf{D} * \mathbf{E} = (((\mathbf{A} * \mathbf{B}) * \mathbf{C}) * \mathbf{D}) * \mathbf{E} \end{array}
The consequence of this convention is that the transformations are composed from left to right, so the leftmost matrix of a product is the first one applied to a vector. In practice this convention is usually paired with storing the matrix in the row-major layout and treating the vertices as row vectors.
In common usage in 3D applications this means:
- Transform vector from one space to another
\vec{V_w} = Transformed vector in another space. A vector is a row vector
\vec{V_c} = Vector in the original space. A vector is a row vector
\mathbf{M} = Transform Matrix
\begin{array}{c} \vec{V_w} = \vec{V_c} * \mathbf{M} \\ \end{array} \\ \begin{array}{c} \begin{bmatrix} x & y & z & w \end{bmatrix} = \begin{bmatrix} v_1 & v_2 & v_3 & v_4 \end{bmatrix} * \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14}\\ m_{21} & m_{22} & m_{23} & m_{24}\\ m_{31} & m_{32} & m_{33} & m_{34}\\ m_{41} & m_{42} & m_{43} & m_{44}\\ \end{bmatrix} \end{array}
- Computing Final Transformation Matrix
\mathbf{M_f} = Final Matrix
\mathbf{M_m} = Model transform, transforming from local model space to global world space
\mathbf{M_v} = View matrix, transforming from world space to camera space
\mathbf{M_p} = Projection Matrix, transforming from camera space to clipping space
\begin{array}{c} \mathbf{M_f} = \mathbf{M_m} * \mathbf{M_v} * \mathbf{M_p} \\ \end{array}
- World Matrix Node Graph composition
\mathbf{M_w} = World transformation matrix
\mathbf{M_l} = Current local transformation matrix
\mathbf{M_p} = Transformation matrix of the parent node
\begin{array}{c} \mathbf{M_w} = \mathbf{M_l} * \mathbf{M_p} \end{array}
- Computing Local Transformation Matrix
\mathbf{M_l} = Node current local transformation matrix
\mathbf{M_s} = Scaling matrix
\mathbf{M_r} = Rotation matrix
\mathbf{M_t} = Translation matrix
\begin{array}{c} \mathbf{M_l} = \mathbf{M_s} * \mathbf{M_r} * \mathbf{M_t} \\ \end{array}
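The following C++ sketch illustrates the composition \mathbf{M_s} * \mathbf{M_r} * \mathbf{M_t} under stated assumptions: vectors are row vectors multiplied from the left (v * M), the translation sits in the last row, and the rotation is a 90 degree turn around the Z-Axis so that all values stay exact. The types, names and sample values are hypothetical.

```cpp
struct Vec4 { double v[4]; };
struct Mat4 { double m[4][4]; };

// Matrix product a * b.
constexpr Mat4 operator*(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Row vector times matrix: v' = v * M.
constexpr Vec4 operator*(const Vec4& v, const Mat4& a) {
    Vec4 r{};
    for (int j = 0; j < 4; ++j)
        for (int k = 0; k < 4; ++k)
            r.v[j] += v.v[k] * a.m[k][j];
    return r;
}

constexpr bool same(const Vec4& a, const Vec4& b) {
    for (int i = 0; i < 4; ++i)
        if (a.v[i] != b.v[i]) return false;
    return true;
}

// Uniform scale by 2.
constexpr Mat4 S{{{2, 0, 0, 0}, {0, 2, 0, 0}, {0, 0, 2, 0}, {0, 0, 0, 1}}};
// Rotation by 90 degrees around Z, written for row vectors (X maps to Y).
constexpr Mat4 R{{{0, 1, 0, 0}, {-1, 0, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}};
// Translation by (10, 0, 0), stored in the last row.
constexpr Mat4 T{{{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {10, 0, 0, 1}}};

constexpr Vec4 p{{1, 0, 0, 1}};

// S * R * T: scale first, then rotate, then translate.
// (1,0,0) -> (2,0,0) -> (0,2,0) -> (10,2,0)
static_assert(same(p * (S * R * T), Vec4{{10, 2, 0, 1}}), "");
// The opposite order translates first, which gives a different point.
static_assert(same(p * (T * R * S), Vec4{{0, 22, 0, 1}}), "");
```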
Post-multiplication
The matrix post-multiplication order convention can be described as placing parentheses in a way (shown below) to indicate the intention of a particular order in which an operation (e.g. a combination of transformations) is performed.
\begin{array}{c} \mathbf{Z} = \mathbf{A} * \mathbf{B} * \mathbf{C} * \mathbf{D} * \mathbf{E} = \mathbf{A} * (\mathbf{B} * (\mathbf{C} * (\mathbf{D} * \mathbf{E}))) \end{array}
The consequence of this convention is that the transformations are composed from right to left, so the rightmost matrix of a product is the first one applied to a vector. In practice this convention is usually paired with storing the matrix in the column-major layout and treating the vertices as column vectors.
In common usage in 3D applications this means:
- Transform vector from one space to another
\vec{V_w} = Transformed vector in another space. A vector is a column vector.
\vec{V_c} = Vector in the original space. A vector is a column vector.
\mathbf{M} = Transform Matrix.
\begin{array}{c} \vec{V_w} = \mathbf{M} * \vec{V_c} \\ \end{array} \\ \begin{array}{c} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14}\\ m_{21} & m_{22} & m_{23} & m_{24}\\ m_{31} & m_{32} & m_{33} & m_{34}\\ m_{41} & m_{42} & m_{43} & m_{44}\\ \end{bmatrix} * \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix} \end{array}
- Computing Final Transformation Matrix
\mathbf{M_f} = Final Matrix
\mathbf{M_m} = Model transform, transforming from local model space to global world space
\mathbf{M_v} = View matrix, transforming from world space to camera space
\mathbf{M_p} = Projection Matrix, transforming from camera space to clipping space
\begin{array}{c} \mathbf{M_f} = \mathbf{M_p} * \mathbf{M_v} * \mathbf{M_m} \\ \end{array}
- World Matrix Node Graph composition
\mathbf{M_w} = World transformation matrix
\mathbf{M_l} = Current local transformation matrix
\mathbf{M_p} = Transformation matrix of the parent node
\begin{array}{c} \mathbf{M_w} = \mathbf{M_p} * \mathbf{M_l} \end{array}
- Computing Local Transformation Matrix
\mathbf{M_l} = Node current local transformation matrix
\mathbf{M_s} = Scaling matrix
\mathbf{M_r} = Rotation matrix
\mathbf{M_t} = Translation matrix
\begin{array}{c} \mathbf{M_l} = \mathbf{M_t} * \mathbf{M_r} * \mathbf{M_s} \\ \end{array}
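The composition \mathbf{M_t} * \mathbf{M_r} * \mathbf{M_s} can be sketched in C++ under these assumptions: vectors are column vectors multiplied from the right (M * v), the translation sits in the last column, and the rotation is a 90 degree turn around the Z-Axis so that all values stay exact. The types, names and sample values are hypothetical.

```cpp
struct Vec4 { double v[4]; };
struct Mat4 { double m[4][4]; };

// Matrix product a * b.
constexpr Mat4 operator*(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Matrix times column vector: v' = M * v.
constexpr Vec4 operator*(const Mat4& a, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int k = 0; k < 4; ++k)
            r.v[i] += a.m[i][k] * v.v[k];
    return r;
}

constexpr bool same(const Vec4& a, const Vec4& b) {
    for (int i = 0; i < 4; ++i)
        if (a.v[i] != b.v[i]) return false;
    return true;
}

// Uniform scale by 2.
constexpr Mat4 S{{{2, 0, 0, 0}, {0, 2, 0, 0}, {0, 0, 2, 0}, {0, 0, 0, 1}}};
// Rotation by 90 degrees around Z, written for column vectors (X maps to Y).
constexpr Mat4 R{{{0, -1, 0, 0}, {1, 0, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}};
// Translation by (10, 0, 0), stored in the last column.
constexpr Mat4 T{{{1, 0, 0, 10}, {0, 1, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}};

constexpr Vec4 p{{1, 0, 0, 1}};

// T * R * S: the rightmost matrix acts first, so scale, then rotate, then translate.
// (1,0,0) -> (2,0,0) -> (0,2,0) -> (10,2,0)
static_assert(same((T * R * S) * p, Vec4{{10, 2, 0, 1}}), "");
// The opposite order translates first, which gives a different point.
static_assert(same((S * R * T) * p, Vec4{{0, 22, 0, 1}}), "");
```

The scale, rotation and translation are applied in the same geometric order as S * R * T with row vectors; only the written order of the product is reversed.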
3D Cartesian coordinate system handedness
Before discussing the topic of the orientation of the 3D Cartesian coordinate system, one more issue should be mentioned and described, which is the “Up Direction” axial convention.
“Up Direction”
Mathematically, there is no preferred axis to be taken as the “Up Direction” for the 3D worlds. In 3D applications (fortunately) only two conventions have developed over the years which are:
- “Up Direction” is Z-Axis
This convention is very useful, e.g. in architecture, where the “working space” is a plane such as the surface of the earth. The two main working directions are then the X-Axis and Y-Axis, so the “Up Direction” (elevation) is the Z-Axis.
- “Up Direction” is Y-Axis
In this convention, the “working plane” is the monitor screen, the X-Axis and Y-Axis directions are the same as the width and height of the screen, respectively, and the Z-Axis is the “depth” of the screen. So when a 3D virtual world is displayed on the screen, it simulates our everyday perception of the world, which we might experience in FPS games, for example. Therefore, the Y-Axis of the monitor screen is the “Up Direction”.
Coordinate system “handedness”
To determine the 3D Cartesian coordinate system, we need to select three vectors (axes) that are perpendicular to each other. Then, by selecting the X-Axis and the Y-Axis (or Z-Axis) perpendicular to it, we define a plane dividing the space into two half-spaces. The half-space which is “above” the XY-Plane (or XZ-Plane) is named positive, and the one “below” is named negative. This fact correlates with the cross-product in three-dimensional space. To get a unique and consistent result, we need to choose the direction of the resulting vector. This can be done by using the right or left-hand rule.
Left-hand rule | Right-hand rule |
---|---|
![]() | ![]() |
The right or left-hand rule can be briefly described as:
Let the straightened index finger of the right or left hand point in the direction of the first vector (\vec{a}), and the bent middle finger point in the direction of the second vector (\vec{b}). The extended thumb of the chosen hand then indicates the direction of the cross-product vector \vec{a}\times\vec{b}.
To determine the directions of the X, Y, Z axes of the 3D Cartesian coordinate system, replace the first vector with the direction of the X-Axis and the second vector with the direction of the Y-Axis; the thumb will then indicate the direction of the Z-Axis.
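In code, the handedness of a set of axes can only be tested relative to some reference frame. The C++ sketch below assumes that the three axes are given as coordinates in a reference frame, and it reports whether they have the same orientation as that frame by looking at the sign of the scalar triple product. The names are hypothetical.

```cpp
struct Vec3 {
    double x, y, z;
};

constexpr Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

constexpr double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// True when the axes (x, y, z) have the same handedness as the frame
// in which their coordinates are expressed: (x cross y) points along z.
constexpr bool sameHandedness(const Vec3& x, const Vec3& y, const Vec3& z) {
    return dot(cross(x, y), z) > 0.0;
}

// The reference axes themselves trivially keep the orientation.
static_assert(sameHandedness({1, 0, 0}, {0, 1, 0}, {0, 0, 1}), "");
// Flipping one axis (here Z) switches to the opposite handedness.
static_assert(!sameHandedness({1, 0, 0}, {0, 1, 0}, {0, 0, -1}), "");
```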
In this way, we can designate a 3D Cartesian coordinate system with the desired handedness (orientation) which can be represented in the table below.
Up direction | Left-hand rule | Right-hand rule |
---|---|---|
Y-Up | ![]() | ![]() |
Z-Up | ![]() | ![]() |
The orientation of the coordinate system also determines the direction of increasing angles around each axis. It can be briefly described as follows:
Point your thumb in the direction of the selected axis; the other, bent fingers then point in the direction in which the angles around this axis increase.
It can be represented in the table below.
Up direction | Left-hand rule | Right-hand rule |
---|---|---|
Y-Up | ![]() | ![]() |
Z-Up | ![]() | ![]() |
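The direction of increasing angles can be sketched numerically. The C++ example below assumes the common rotation formula around the Z-Axis for column vectors and takes the cosine and sine of the angle as parameters, so that a quarter turn stays exact. Within one and the same set of coordinates, measuring the angle with the other hand amounts to negating the angle. The names are hypothetical.

```cpp
struct Vec3 {
    double x, y, z;
};

// Rotates p around the Z-Axis; cosA and sinA are the cosine and sine of the angle.
constexpr Vec3 rotateZ(const Vec3& p, double cosA, double sinA) {
    return {cosA * p.x - sinA * p.y,
            sinA * p.x + cosA * p.y,
            p.z};
}

constexpr Vec3 xAxis{1, 0, 0};

// +90 degrees (cos = 0, sin = 1): the X-Axis turns towards the Y-Axis.
constexpr Vec3 positiveTurn = rotateZ(xAxis, 0.0, 1.0);
static_assert(positiveTurn.x == 0.0 && positiveTurn.y == 1.0, "");

// -90 degrees (cos = 0, sin = -1): the X-Axis turns away from the Y-Axis.
constexpr Vec3 negativeTurn = rotateZ(xAxis, 0.0, -1.0);
static_assert(negativeTurn.x == 0.0 && negativeTurn.y == -1.0, "");
```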
The table below shows examples of the Cartesian coordinate system used in 3D computer graphics programs and game engines:
Up direction | Left-hand rule | Right-hand rule |
---|---|---|
Y-Up | LightWave, ZBrush, Cinema 4D, Unity | Autodesk Maya, Modo, Houdini 3D Animation Tool, Substance Painter, Marmoset Toolbag, Godot, Object-Oriented Graphics Rendering Engine (OGRE) |
Z-Up | Unreal Engine, Open 3D Engine | Autodesk 3ds Max, Blender, SketchUp, Autodesk AutoCAD, CRYENGINE, UNIGINE |
The last thing worth mentioning is that in mathematics and physics the right-hand orientation is the most commonly preferred orientation. It is used to find directions, e.g. in magnetic and electromagnetic fields, and to describe enantiomers in chemistry.
Quaternions and 3D Cartesian coordinate system handedness
Unfortunately, the topic of “handedness” does not spare quaternions either.
Hamilton multiplication convention
W.R. Hamilton, the inventor of quaternions, defined their properties (more precisely, the product of their basic elements \mathbf{ij = k}) and thus somewhat defined their “orientation” (in terms of rotations in 3D space). So now rotating vectors using quaternions boils down to their so-called “sandwich product” and is defined as
v' = QvQ^{-1}
where v is the vector rotated by the unit quaternion Q, and Q^{-1} is the inverse of Q, which for a unit quaternion equals its conjugate.
The rotation matrix created using Hamilton’s definition can be identified as the right-hand orientation rotation matrix and it is as follows:
\begin{bmatrix} 1 - 2(Q_y^2 + Q_z^2) & 2(Q_xQ_y - Q_zQ_w) & 2(Q_xQ_z + Q_yQ_w)\\ 2(Q_xQ_y + Q_zQ_w) & 1 - 2(Q_x^2 + Q_z^2) & 2(Q_yQ_z - Q_xQ_w)\\ 2(Q_xQ_z - Q_yQ_w) & 2(Q_yQ_z + Q_xQ_w) & 1 - 2(Q_x^2 + Q_y^2)\\ \end{bmatrix}
The above definitions can be found in most mathematics textbooks dealing with this subject.
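A C++ sketch that cross-checks the sandwich product against the matrix above. It assumes the Hamilton product with ij = k, a column vector on the right of the matrix, and the sample unit quaternion (0.5, 0.5, 0.5, 0.5), which is a 120 degree rotation around the diagonal (1, 1, 1) and keeps all arithmetic exact. The types and function names are hypothetical.

```cpp
struct Vec3 { double x, y, z; };
struct Quat { double x, y, z, w; };

// Hamilton product a * b (ij = k).
constexpr Quat hamilton(const Quat& a, const Quat& b) {
    return {a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
            a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
            a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w,
            a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z};
}

// For a unit quaternion the conjugate is also the inverse.
constexpr Quat conjugate(const Quat& q) {
    return {-q.x, -q.y, -q.z, q.w};
}

// Sandwich product v' = Q v Q^{-1}, with v embedded as a pure quaternion.
constexpr Vec3 rotate(const Quat& q, const Vec3& v) {
    const Quat r = hamilton(hamilton(q, Quat{v.x, v.y, v.z, 0.0}), conjugate(q));
    return {r.x, r.y, r.z};
}

// The same rotation through the matrix from the text, applied to a column vector.
constexpr Vec3 rotateByMatrix(const Quat& q, const Vec3& v) {
    const double m[3][3] = {
        {1 - 2 * (q.y * q.y + q.z * q.z), 2 * (q.x * q.y - q.z * q.w), 2 * (q.x * q.z + q.y * q.w)},
        {2 * (q.x * q.y + q.z * q.w), 1 - 2 * (q.x * q.x + q.z * q.z), 2 * (q.y * q.z - q.x * q.w)},
        {2 * (q.x * q.z - q.y * q.w), 2 * (q.y * q.z + q.x * q.w), 1 - 2 * (q.x * q.x + q.y * q.y)}};
    return {m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z,
            m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z,
            m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z};
}

constexpr Quat q{0.5, 0.5, 0.5, 0.5};
constexpr Vec3 xAxis{1, 0, 0};

// Both routes send the X-Axis to the Y-Axis.
static_assert(rotate(q, xAxis).y == 1.0, "");
static_assert(rotateByMatrix(q, xAxis).y == 1.0, "");
```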
Shuster multiplication convention
However, there is an alternative convention proposed by M.D. Shuster and used in NASA’s Jet Propulsion Laboratory and, importantly in computer graphics, in the Microsoft DirectX Math Library. In Shuster’s convention, the quaternion product of the basis elements is now \mathbf{ij = -k} (in a way, it changes the orientation of space). In addition it is necessary to change the order of quaternions in a “sandwich product”
v' = Q^{-1}vQ
where v is the vector rotated by the unit quaternion Q, and Q^{-1} is the inverse of Q, which for a unit quaternion equals its conjugate.
The rotation matrix created using Shuster’s definition can be identified as the left-hand orientation rotation matrix and it is as follows:
R = \begin{bmatrix} 1 - 2(Q_y^2 + Q_z^2) & 2(Q_xQ_y + Q_zQ_w) & 2(Q_xQ_z - Q_yQ_w)\\ 2(Q_xQ_y - Q_zQ_w) & 1 - 2(Q_x^2 + Q_z^2) & 2(Q_yQ_z + Q_xQ_w)\\ 2(Q_xQ_z + Q_yQ_w) & 2(Q_yQ_z - Q_xQ_w) & 1 - 2(Q_x^2 + Q_y^2)\\ \end{bmatrix}
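Comparing the two matrices term by term shows that this one is the transpose of the Hamilton one. The C++ sketch below builds both from the same sample unit quaternion (0.5, 0.5, 0.5, 0.5) and checks that, applied to the same column vector, they turn in opposite directions. The types and function names are hypothetical, and the column-vector application is an assumption of this example.

```cpp
struct Vec3 { double x, y, z; };
struct Quat { double x, y, z, w; };
struct Mat3 { double m[3][3]; };

// Rotation matrix of the Hamilton convention, as printed in the text.
constexpr Mat3 hamiltonMatrix(const Quat& q) {
    return {{{1 - 2 * (q.y * q.y + q.z * q.z), 2 * (q.x * q.y - q.z * q.w), 2 * (q.x * q.z + q.y * q.w)},
             {2 * (q.x * q.y + q.z * q.w), 1 - 2 * (q.x * q.x + q.z * q.z), 2 * (q.y * q.z - q.x * q.w)},
             {2 * (q.x * q.z - q.y * q.w), 2 * (q.y * q.z + q.x * q.w), 1 - 2 * (q.x * q.x + q.y * q.y)}}};
}

// Rotation matrix of the Shuster convention, as printed in the text.
constexpr Mat3 shusterMatrix(const Quat& q) {
    return {{{1 - 2 * (q.y * q.y + q.z * q.z), 2 * (q.x * q.y + q.z * q.w), 2 * (q.x * q.z - q.y * q.w)},
             {2 * (q.x * q.y - q.z * q.w), 1 - 2 * (q.x * q.x + q.z * q.z), 2 * (q.y * q.z + q.x * q.w)},
             {2 * (q.x * q.z + q.y * q.w), 2 * (q.y * q.z - q.x * q.w), 1 - 2 * (q.x * q.x + q.y * q.y)}}};
}

// Matrix times column vector.
constexpr Vec3 apply(const Mat3& a, const Vec3& v) {
    return {a.m[0][0] * v.x + a.m[0][1] * v.y + a.m[0][2] * v.z,
            a.m[1][0] * v.x + a.m[1][1] * v.y + a.m[1][2] * v.z,
            a.m[2][0] * v.x + a.m[2][1] * v.y + a.m[2][2] * v.z};
}

constexpr bool isTranspose(const Mat3& a, const Mat3& b) {
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            if (a.m[i][j] != b.m[j][i]) return false;
    return true;
}

constexpr Quat q{0.5, 0.5, 0.5, 0.5};
constexpr Vec3 xAxis{1, 0, 0};

static_assert(isTranspose(hamiltonMatrix(q), shusterMatrix(q)), "");
// Hamilton matrix: X goes to Y. Shuster matrix: X goes to Z (the inverse rotation).
static_assert(apply(hamiltonMatrix(q), xAxis).y == 1.0, "");
static_assert(apply(shusterMatrix(q), xAxis).z == 1.0, "");
```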
Transformation matrices
All of the above considerations, namely the non-commutative nature of matrix multiplication and the anticommutativity of the vector cross product, lead to four alternative ways of constructing a 3D transformation as a 4×4 matrix. Each of those alternatives is described in its own section below.
How to detect LH/RH Row/Column-Major Pre/Post Multiplication convention?
When the convention used in a particular application is unknown, answering the following questions will help to identify it:
- How is the translation matrix constructed? It is necessary to check how the components of the translation vector XYZ are stored in memory. Are they stored next to each other (1a) or apart from each other (1b)?
- How is a rotation around the X-Axis or Z-Axis stored in memory? Is the first sine element encountered in memory -sin (2a) or sin (2b)? For a rotation around the Y-Axis the signs are reversed: is the first sine element in memory sin (2a) or -sin (2b)?
- In which order are the transformations (Scale, Rotation, and Translation) composed: S * R * T (3a) or T * R * S (3b)? Equivalently, is the matrix-vector multiplication computed as V * M (3a) or M * V (3b)?
With the answers to the above questions, the conventions we are looking for can be found in the table below:
Question 1 | Question 2 | Question 3 | \rightarrow | Convention |
---|---|---|---|---|
1a | 2a | 3a | \rightarrow | Pre-multiplication, Row-Major, Right-handed |
1b | 2a | 3a | \rightarrow | Pre-multiplication, Column-Major, Left-handed |
1a | 2b | 3a | \rightarrow | Pre-multiplication, Row-Major, Left-handed |
1b | 2b | 3a | \rightarrow | Pre-multiplication, Column-Major, Right-handed |
1a | 2a | 3b | \rightarrow | Post-multiplication, Column-Major, Left-handed |
1b | 2a | 3b | \rightarrow | Post-multiplication, Row-Major, Right-handed |
1a | 2b | 3b | \rightarrow | Post-multiplication, Column-Major, Right-handed |
1b | 2b | 3b | \rightarrow | Post-multiplication, Row-Major, Left-handed |
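The first question can be answered by inspecting the raw memory of a translation matrix. The C++ sketch below assumes that the application under test hands back a pure translation matrix as 16 consecutive floats, built from a known probe vector, and that the homogeneous coordinate comes last. The enum and function names are hypothetical.

```cpp
// Possible answers to question 1.
enum class TranslationStorage { Adjacent, Strided, Unknown };

// m is the raw memory of a 4x4 translation matrix built from (tx, ty, tz).
// Use a probe vector with three distinct non-zero values, e.g. (2, 3, 4).
constexpr TranslationStorage classifyTranslation(const float (&m)[16],
                                                 float tx, float ty, float tz) {
    // 1a: the components sit next to each other at offsets 12, 13, 14.
    if (m[12] == tx && m[13] == ty && m[14] == tz)
        return TranslationStorage::Adjacent;
    // 1b: the components are four floats apart at offsets 3, 7, 11.
    if (m[3] == tx && m[7] == ty && m[11] == tz)
        return TranslationStorage::Strided;
    return TranslationStorage::Unknown;
}

// Two sample memory dumps of a translation by (2, 3, 4).
constexpr float adjacent[16] = {1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 2, 3, 4, 1};
constexpr float strided[16] = {1, 0, 0, 2, 0, 1, 0, 3, 0, 0, 1, 4, 0, 0, 0, 1};

static_assert(classifyTranslation(adjacent, 2, 3, 4) == TranslationStorage::Adjacent, "");
static_assert(classifyTranslation(strided, 2, 3, 4) == TranslationStorage::Strided, "");
```

The answer alone does not identify the convention; it has to be combined with the answers to questions 2 and 3 as in the table above.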
What’s coming next
- Yaw Pitch Roll Angles
- Composition order results of rotations around the coordinate axes.
- Gimbal Lock problem.
- Transition from one convention to another. Necessary transformations and their consequences on the data.
- Oblique and Axonometric Projections
- More Miscellaneous Matrices