OpenGL Transforms and the Inverse Model View
Below, I mentioned that I stored the inverse modelview matrix in gl_TextureMatrix[0]. Why did I do this? Well, I needed to transform the vertices and vectors from eye space to world space. Unfortunately, OpenGL sets up the modelview matrix to transform from object space to camera space; the inverse of that would skip right over world space and go straight back to object space. Solution? Once the camera transform is set up, invert it and store it in the texture matrix. This way, we can transform something to camera space (which we usually do anyway), then transform it to world space by removing the camera transform. But first, a review of all the transforms OpenGL does.
Object Space to Image Space (Quick and Dirty)
First, how do we move from object space to image space in OpenGL? Well, we start off with a vertex V in object space, simple enough. Then, some transforms are applied to the object. Let’s say it gets rotated, translated, scaled, rotated again and translated once more. Each of these transforms can be described by its own matrix, but for simplicity’s sake, we’ll say that they were all multiplied into a single matrix, M. So, we have V, a vertex in object space, and now V * M, a vertex in world space.
Next, there’s the transformation from world space to camera space. In OpenGL, camera space consists of right = +X, up = +Y, and forward = -Z. This transform can be set up using the command gluLookAt with the camera’s location, point of focus and up vector. We’ll call this matrix C. Moving a vertex from object space to camera space is then V * M * C. Seeing the pattern?
Once we are in camera space, we use the projection matrix (specified with functions like glFrustum and gluOrtho2D) to again change the coordinate space. We’ll call this matrix P, and the sequence V * M * C * P will tell us which vertices will be clipped. Once in this space (and after the perspective divide), if the x, y or z coordinate of the vertex is outside [-1, +1], then the vertex is clipped. (The [0, +1] depth range only shows up later, when z is mapped to window coordinates.)
So, what do we have? Taking a vertex V in object space, multiplying it by M moves it to world space, multiplying that by C moves it to camera space, and multiplying that by P moves it to projected space: V * M * C * P.
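To make the V * M * C chain concrete, here is a minimal C++ sketch. The Vec4 and Mat4 types, the helper names and the scene values are all made up for illustration (they are not OpenGL calls), and both M and C are plain translations to keep the arithmetic easy to follow. P would be multiplied on in exactly the same way.

```cpp
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };   // row-major: m[row][col]

// Row vector times matrix (V * M), the order used in the text.
Vec4 transform(const Vec4& v, const Mat4& a) {
    const float in[4] = {v.x, v.y, v.z, v.w};
    float out[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            out[col] += in[row] * a.m[row][col];
    return Vec4{out[0], out[1], out[2], out[3]};
}

// Matrix product a * b: applying the result equals applying a, then b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};   // zero-initialised
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Translation for row vectors: the offset sits in the bottom row.
Mat4 translation(float tx, float ty, float tz) {
    return Mat4{{{1.0f, 0.0f, 0.0f, 0.0f},
                 {0.0f, 1.0f, 0.0f, 0.0f},
                 {0.0f, 0.0f, 1.0f, 0.0f},
                 {tx,   ty,   tz,   1.0f}}};
}

// Hypothetical scene: the object is moved 2 units along +X (M), and the
// camera sits at (0, 0, 10) looking down -Z, so C translates by (0, 0, -10).
Vec4 objectToCamera(const Vec4& v) {
    const Mat4 M = translation(2.0f, 0.0f, 0.0f);
    const Mat4 C = translation(0.0f, 0.0f, -10.0f);
    return transform(v, multiply(M, C));   // V * (M * C)
}
```

Calling objectToCamera on the object-space vertex (1, 0, 0, 1) first lands at (3, 0, 0, 1) in world space and then at (3, 0, -10, 1) in camera space, which is 10 units in front of the camera along -Z.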
Model View Matrix: Object Space to Camera Space
The matrix created to move a vertex to world space is defined by the various transforms applied to that vertex in OpenGL through functions like glTranslate, glRotate and glScale. Stacking up the transforms creates a single matrix to change to world space. It’s also worth noting that the final matrix has an inverse, and it is very easy to create if you know the transforms that were used to create it. Let’s say you translate something by 10, rotate it by 90 and scale it by 0.5. This results in a 4×4 matrix where the original transforms have been muddled together to do the entire series of transforms at once. The inverse of this matrix has to undo each of those transforms in the reverse order. So the inverse can be constructed by scaling by 2.0, rotating by -90 and translating by -10. Easy as pie.
The camera space transform can be easily specified by the function gluLookAt. Its reference page describes how the matrix is created for OpenGL. Here’s an easier-to-follow implementation using GLSL-like C++ types.
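The translate/rotate/scale example can be checked numerically. The sketch below is only an illustration: the text does not say which axes are involved, so it assumes the translation is along X and the rotation is about Z, and the Mat4 type and helper names are invented for this example.

```cpp
#include <cmath>

struct Mat4 { float m[4][4]; };   // row-major: m[row][col]

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};   // zero-initialised
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Assumption: the "translate by 10" is along the X axis.
Mat4 translationX(float t) {
    return Mat4{{{1.0f, 0.0f, 0.0f, 0.0f}, {0.0f, 1.0f, 0.0f, 0.0f},
                 {0.0f, 0.0f, 1.0f, 0.0f}, {t,    0.0f, 0.0f, 1.0f}}};
}

// Assumption: the "rotate by 90" is in degrees about the Z axis.
Mat4 rotationZ(float degrees) {
    const float r = degrees * 3.14159265f / 180.0f;
    const float c = std::cos(r), s = std::sin(r);
    return Mat4{{{c,    s,    0.0f, 0.0f}, {-s,   c,    0.0f, 0.0f},
                 {0.0f, 0.0f, 1.0f, 0.0f}, {0.0f, 0.0f, 0.0f, 1.0f}}};
}

Mat4 scaling(float k) {
    return Mat4{{{k,    0.0f, 0.0f, 0.0f}, {0.0f, k,    0.0f, 0.0f},
                 {0.0f, 0.0f, k,    0.0f}, {0.0f, 0.0f, 0.0f, 1.0f}}};
}

// Forward: translate by 10, rotate by 90, scale by 0.5.
Mat4 forwardTransform() {
    return multiply(multiply(translationX(10.0f), rotationZ(90.0f)), scaling(0.5f));
}

// Inverse: the opposite transforms in the opposite order.
Mat4 inverseTransform() {
    return multiply(multiply(scaling(2.0f), rotationZ(-90.0f)), translationX(-10.0f));
}

// Largest absolute difference between a matrix and the identity.
float distanceFromIdentity(const Mat4& a) {
    float worst = 0.0f;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            worst = std::fmax(worst, std::fabs(a.m[i][j] - (i == j ? 1.0f : 0.0f)));
    return worst;
}
```

Multiplying forwardTransform() by inverseTransform() should give the identity up to floating point rounding, so distanceFromIdentity of the product comes out very close to zero.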
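// Signature assumed from gluLookAt's arguments (eye, center, up)
mat4 LookAt(vec3 eye, vec3 center, vec3 up)
{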
// forward pointing vector
vec3 f(center - eye);
f.normalize();
up.normalize();
// right pointing vector
vec3 s(f.cross(up));
// orthonormal up vector
vec3 u(s.cross(f));
s.normalize();
u.normalize();
// construct orthonormal orientation transfom
mat4 Orient(
s.x, s.y, s.z, 0.0f,
u.x, u.y, u.z, 0.0f,
-f.x, -f.y, -f.z, 0.0f,
0.0f, 0.0f, 0.0f, 1.0f
);
// translate the new coordinate system to the origin
mat4 Translate(
1.0f, 0.0f, 0.0f, -eye.x,
0.0f, 1.0f, 0.0f, -eye.y,
0.0f, 0.0f, 1.0f, -eye.z,
0.0f, 0.0f, 0.0f, 1.0f
);
return Orient * Translate;
}
If you pay close attention to the Orient matrix, you’ll see that the rows represent the right, up and -forward vectors that will correspond to +X, +Y and +Z (remember that the OpenGL camera looks down -Z). This is a standard change of orientation matrix; it effectively rotates the world around so that the camera’s right, up and -forward vectors line up with the +X, +Y and +Z axes. The Translate matrix shifts everything by the negated camera position, which moves the camera’s location to the origin.
One sticky part to note about all of this is the matrix convention. The matrices above are written out row by row and only give the right answer when they are applied to a column vector on their right (Orient * Translate * V), even though the rest of this post starts with the vertex and multiplies the transforms on afterwards (V * M * C). OpenGL composes its matrices in the column vector style (that’s why you clear the model view matrix, then specify the camera, then specify the transforms and finally the vertex last), but it stores each matrix column by column. In order to hand OpenGL the 16 floats in the order it expects, we need to write out the transposed matrices, which are also exactly the matrices that work with the V * M * C ordering. This means that our function needs to be reworked. One important point to remember: the transpose of multiplied matrices is the same as multiplying the transposes of the matrices in reverse order, i.e. (A * B)^T = B^T * A^T. Here is the correct OpenGL-friendly code.
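// Signature assumed from gluLookAt's arguments (eye, center, up)
mat4 LookAt(vec3 eye, vec3 center, vec3 up)
{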
vec3 f(center - eye);
f.normalize();
up.normalize();
vec3 s(f.cross(up));
vec3 u(s.cross(f));
s.normalize();
u.normalize();
mat4 Orient(
s.x, u.x, -f.x, 0.0f,
s.y, u.y, -f.y, 0.0f,
s.z, u.z, -f.z, 0.0f,
0.0f, 0.0f, 0.0f, 1.0f
);
mat4 Translate(
1.0f, 0.0f, 0.0f, 0.0f,
0.0f, 1.0f, 0.0f, 0.0f,
0.0f, 0.0f, 1.0f, 0.0f,
-eye.x, -eye.y, -eye.z, 1.0f
);
return Translate * Orient;
}
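A quick way to convince yourself that the function above does what it should: treat the vertex as a row vector (V * C, as in the rest of this post) and check that the eye lands on the origin and the point of focus lands on the -Z axis. The sketch below uses stand-in Vec3, Vec4 and Mat4 types and free functions instead of the GLSL-like classes used above, so those names are assumptions for illustration only.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };   // the 16 floats in the order they are listed

Vec3 sub(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }

Vec3 cross(Vec3 a, Vec3 b) {
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

Vec3 normalize(Vec3 v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return Vec3{v.x / len, v.y / len, v.z / len};
}

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};   // zero-initialised
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Row vector times matrix (V * C).
Vec4 transform(const Vec4& v, const Mat4& a) {
    const float in[4] = {v.x, v.y, v.z, v.w};
    float out[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            out[col] += in[row] * a.m[row][col];
    return Vec4{out[0], out[1], out[2], out[3]};
}

// Same construction as the listing above: C = Translate * Orient.
Mat4 lookAt(Vec3 eye, Vec3 center, Vec3 up) {
    const Vec3 f = normalize(sub(center, eye));        // forward
    const Vec3 s = normalize(cross(f, normalize(up))); // right
    const Vec3 u = normalize(cross(s, f));             // orthonormal up
    const Mat4 orient{{{s.x,  u.x,  -f.x, 0.0f},
                       {s.y,  u.y,  -f.y, 0.0f},
                       {s.z,  u.z,  -f.z, 0.0f},
                       {0.0f, 0.0f, 0.0f, 1.0f}}};
    const Mat4 translate{{{1.0f,   0.0f,   0.0f,   0.0f},
                          {0.0f,   1.0f,   0.0f,   0.0f},
                          {0.0f,   0.0f,   1.0f,   0.0f},
                          {-eye.x, -eye.y, -eye.z, 1.0f}}};
    return multiply(translate, orient);
}
```

With an eye at (3, 4, 5) looking at the origin, transforming the eye position should give roughly (0, 0, 0, 1), and transforming the origin should give a point on the -Z axis whose distance from the camera equals the length of (3, 4, 5).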
It is important to realise that OpenGL’s Model View matrix is actually M * C, so, if we want to get the position of the vertex in world space, we will have to use the modelview matrix to go to camera space, then multiply by C^-1 (the inverse of the camera matrix). This gives us V * M * C * C^-1 = V * gl_ModelViewMatrix * C^-1 = V * M. And world space is exactly what we need to do shadow mapping, refraction, reflection and a plethora of other things. OpenGL took care of this for us with the GL_ARB_shadow extension. By using glTexGen, OpenGL computes the inverse camera matrix and applies it without us having to fuss with inverses. But now we’ll need to do an inverse ourselves.
Inverse gluLookAt
Remember that the camera matrix is given to us by gluLookAt, and above, we have the exact implementation of how the matrix is created. If you know your matrices, you may notice some properties that the camera matrix adheres to when being constructed this way. Normally, for a generic matrix, computing the inverse is a slow and painful process. You really don’t want to invert a lot of matrices in order to run your program in real time. Luckily, the camera matrix isn’t just a generic matrix; it has a specific construction with easily invertible properties.
If you paid attention to the construction of the Orient matrix, you would have seen that the columns represent an orthonormal coordinate system. That is, the vectors s, u and -f are all at 90 degree angles to each other and their lengths are equal to 1. The property that this gives us is that the dot product of any two of those vectors equals 0 when the vectors are different. When the vectors are the same, the dot product is the length of the vector squared, or 1.
Now, imagine a matrix multiply as a set of dot products between two vectors, a row of the first matrix and a column of the second matrix. What we want to do is construct the inverse of Orient such that Orient * Orient^-1 equals the identity matrix (0’s everywhere with 1’s on the diagonal). See where I’m going? As it turns out, the inverse of a matrix that represents an orthonormal coordinate system is exactly its transpose.
mat4 Orient(
s.x, u.x, -f.x, 0.0f,
s.y, u.y, -f.y, 0.0f,
s.z, u.z, -f.z, 0.0f,
0.0f, 0.0f, 0.0f, 1.0f
);
mat4 OrientInverse(
s.x, s.y, s.z, 0.0f,
u.x, u.y, u.z, 0.0f,
-f.x, -f.y, -f.z, 0.0f,
0.0f, 0.0f, 0.0f, 1.0f
);
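The transpose claim is easy to test numerically. The following sketch builds Orient from an arbitrary forward and up direction and multiplies it by its transpose. The Vec3 and Mat4 types and the helper names are stand-ins invented for this example.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Mat4 { float m[4][4]; };

Vec3 cross(Vec3 a, Vec3 b) {
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

Vec3 normalize(Vec3 v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return Vec3{v.x / len, v.y / len, v.z / len};
}

Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};   // zero-initialised
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

Mat4 transpose(const Mat4& a) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r.m[i][j] = a.m[j][i];
    return r;
}

// Orient with s, u and -f as its columns, built from a viewing direction
// and an up hint (which must not be parallel to the viewing direction).
Mat4 orientFrom(Vec3 direction, Vec3 up) {
    const Vec3 f = normalize(direction);
    const Vec3 s = normalize(cross(f, normalize(up)));
    const Vec3 u = normalize(cross(s, f));
    return Mat4{{{s.x,  u.x,  -f.x, 0.0f},
                 {s.y,  u.y,  -f.y, 0.0f},
                 {s.z,  u.z,  -f.z, 0.0f},
                 {0.0f, 0.0f, 0.0f, 1.0f}}};
}

// Largest absolute difference between a matrix and the identity.
float distanceFromIdentity(const Mat4& a) {
    float worst = 0.0f;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            worst = std::fmax(worst, std::fabs(a.m[i][j] - (i == j ? 1.0f : 0.0f)));
    return worst;
}
```

Each entry of Orient * transpose(Orient) is a dot product between two of the basis vectors, so the product should be the identity up to rounding, whichever viewing direction is used.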
But the camera matrix also contains a translation that gets multiplied with Orient. We know the inverse of a translation is just a negative translation, so that’s easy. So, if C = Translate * Orient, then C^-1 = (Translate * Orient)^-1 = Orient^-1 * Translate^-1. Hey, we know all those pieces! Let’s put it into OpenGL-friendly code.
// Signature assumed to mirror gluLookAt's arguments (eye, center, up)
mat4 InverseLookAt(vec3 eye, vec3 center, vec3 up)
{
vec3 f(center - eye);
f.normalize();
up.normalize();
vec3 s(f.cross(up));
vec3 u(s.cross(f));
s.normalize();
u.normalize();
mat4 OrientInverse(
s.x, s.y, s.z, 0.0f,
u.x, u.y, u.z, 0.0f,
-f.x, -f.y, -f.z, 0.0f,
0.0f, 0.0f, 0.0f, 1.0f
);
mat4 TranslateInverse(
1.0f, 0.0f, 0.0f, 0.0f,
0.0f, 1.0f, 0.0f, 0.0f,
0.0f, 0.0f, 1.0f, 0.0f,
eye.x, eye.y, eye.z, 1.0f
);
return OrientInverse * TranslateInverse;
}
That is probably the most difficult part of doing shadow mapping and other real time techniques (once you get past all of the theory, that is). And look, it’s all wrapped up in a tiny function! Store that in a texture matrix and your shaders have quick and easy access to world space.
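As a final check, the camera matrix and its hand-built inverse should cancel out, and a camera space position multiplied by the inverse should come back as the world space position it started from. The sketch below re-creates both constructions with stand-in types. Vec3, Vec4, Mat4, Basis and the function names are assumptions for illustration, not part of OpenGL.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };
struct Basis { Vec3 s, u, f; };   // right, up, forward

Vec3 cross(Vec3 a, Vec3 b) {
    return Vec3{a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 normalize(Vec3 v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return Vec3{v.x / len, v.y / len, v.z / len};
}
Basis cameraBasis(Vec3 eye, Vec3 center, Vec3 up) {
    const Vec3 f = normalize(Vec3{center.x - eye.x, center.y - eye.y, center.z - eye.z});
    const Vec3 s = normalize(cross(f, normalize(up)));
    return Basis{s, normalize(cross(s, f)), f};
}
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};   // zero-initialised
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}
// Row vector times matrix (V * C).
Vec4 transform(const Vec4& v, const Mat4& a) {
    const float in[4] = {v.x, v.y, v.z, v.w};
    float out[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            out[col] += in[row] * a.m[row][col];
    return Vec4{out[0], out[1], out[2], out[3]};
}
Mat4 translation(Vec3 t) {
    return Mat4{{{1.0f, 0.0f, 0.0f, 0.0f}, {0.0f, 1.0f, 0.0f, 0.0f},
                 {0.0f, 0.0f, 1.0f, 0.0f}, {t.x,  t.y,  t.z,  1.0f}}};
}
// C = Translate * Orient
Mat4 lookAt(Vec3 eye, Vec3 center, Vec3 up) {
    const Basis b = cameraBasis(eye, center, up);
    const Mat4 orient{{{b.s.x, b.u.x, -b.f.x, 0.0f}, {b.s.y, b.u.y, -b.f.y, 0.0f},
                       {b.s.z, b.u.z, -b.f.z, 0.0f}, {0.0f,  0.0f,  0.0f,   1.0f}}};
    return multiply(translation(Vec3{-eye.x, -eye.y, -eye.z}), orient);
}
// C^-1 = Orient^-1 * Translate^-1
Mat4 inverseLookAt(Vec3 eye, Vec3 center, Vec3 up) {
    const Basis b = cameraBasis(eye, center, up);
    const Mat4 orientInverse{{{b.s.x,  b.s.y,  b.s.z,  0.0f}, {b.u.x, b.u.y, b.u.z, 0.0f},
                              {-b.f.x, -b.f.y, -b.f.z, 0.0f}, {0.0f,  0.0f,  0.0f,  1.0f}}};
    return multiply(orientInverse, translation(eye));
}
// Largest absolute difference between a matrix and the identity.
float distanceFromIdentity(const Mat4& a) {
    float worst = 0.0f;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            worst = std::fmax(worst, std::fabs(a.m[i][j] - (i == j ? 1.0f : 0.0f)));
    return worst;
}
```

If you are on the legacy fixed-function API, one plausible way to hand the result to your shaders is glMatrixMode(GL_TEXTURE) followed by glLoadMatrixf on the 16 floats, since the listing order used here matches the column-by-column layout that glLoadMatrixf reads.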
