OpenGL Transforms and the Inverse Model View

Below, I mentioned that I stored the inverse modelview matrix in gl_TextureMatrix[0]. Why did I do this? Well, I needed to transform the vertices and vectors from eye space to world space. Unfortunately, OpenGL sets up the Model View matrix to transform from object space to camera space; the inverse of that would skip right over world space and go straight back to object space. The solution? Once the camera transform is set up, invert it and store it in the texture matrix. This way, we can transform something to camera space (which we usually do anyway), then transform it to world space by removing the camera transform. But first, a review of all the transforms OpenGL does.

Object Space to Image Space (Quick and Dirty)

First, how do we move from object space to image space in OpenGL? Well, we start off with a vertex V in object space, simple enough. Then, some transforms are applied to the object. Let’s say it gets rotated, translated, scaled, rotated again and translated once more. Each of these transforms can be described by its own matrix, but for simplicity’s sake, we’ll say that they were all multiplied into a single matrix, M. So, we have V, a vertex in object space, and now V * M, a vertex in world space.

Next, there’s the transformation from world space to camera space. In OpenGL, camera space consists of right = +X, up = +Y, and forward = -Z. This can be done using the command gluLookAt with the camera’s location, point of focus and up vector. We’ll call this matrix C. Moving a vertex from object space to camera space is then V * M * C. Seeing the pattern?

Once we are in camera space, we use the projection matrix (specified with functions like glFrustum and gluOrtho2D) to again change the coordinate space. We’ll call this matrix P, and the sequence V * M * C * P will tell us which vertices will be clipped. Once in this space (and after the perspective divide), if the x, y or z coordinate of the vertex is outside [-1, +1], then the vertex is clipped. (The z value is only remapped to the [0, +1] depth range afterwards.)

So, what do we have? Taking a vertex V in object space, multiplying it by M moves it to world space, multiplying that by C moves it to camera space, and multiplying that by P moves it to projected space. V * M * C * P.
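As a rough illustration of that chain, here is a minimal C++ sketch using the same convention as the text (vertex on the left, matrices applied left to right). The Vec4 and Mat4 types and the helper names are hypothetical stand-ins, not part of OpenGL, and both M and C are plain translations to keep the numbers easy to follow.

```cpp
#include <array>
#include <cassert>

// Hypothetical minimal types for illustration only (not an OpenGL API).
using Vec4 = std::array<double, 4>;
struct Mat4 { double m[4][4]; };

// Row vector times matrix: out[c] = sum over r of v[r] * m[r][c].
Vec4 mul(const Vec4& v, const Mat4& a) {
    Vec4 out{0.0, 0.0, 0.0, 0.0};
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 4; ++r)
            out[c] += v[r] * a.m[r][c];
    return out;
}

// Translation in the row-vector convention: the offset sits in the last row.
Mat4 translation(double x, double y, double z) {
    return Mat4{{{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {x, y, z, 1}}};
}

// V * M moves the vertex to world space, then * C moves it to camera space.
Vec4 objectToCamera(const Vec4& v, const Mat4& M, const Mat4& C) {
    return mul(mul(v, M), C);
}
```

With M translating by 10 along X and C translating by -5 along Z (a camera sitting at z = 5 with no rotation), the object space vertex (1, 2, 3) ends up at (11, 2, -2) in camera space.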

Model View Matrix: Object Space to Camera Space

The matrix created to move a vertex to world space is defined by the various transforms applied to that vertex in OpenGL through functions like glTranslate, glRotate and glScale. Loading on the transforms creates a single matrix to change to world space. It’s also worth noting that the final matrix has an inverse, and it is very easy to create if you know the transforms that were used to create it. Let’s say you translate something by 10, rotate it by 90 and scale it by 0.5. This results in a 4×4 matrix where the original transforms have been muddled together to do the entire series of transforms at once. The inverse of this matrix would have to undo each of those transforms in the reverse order. So the inverse can be constructed by scaling by 2.0, rotating by -90 and translating by -10. Easy as pie.
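The "undo in reverse order" claim can be checked numerically. The sketch below is only an illustration under assumptions: the Mat4 type and helpers are hypothetical, the translation is taken along X, and the rotation is taken about the Z axis, since the text does not name an axis.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 4x4 matrix type, row-vector convention (v' = v * M).
struct Mat4 { double m[4][4]; };

Mat4 mul(const Mat4& a, const Mat4& b) {
    Mat4 out{};  // zero-initialized
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                out.m[i][j] += a.m[i][k] * b.m[k][j];
    return out;
}

Mat4 translate(double x, double y, double z) {
    return Mat4{{{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {x, y, z, 1}}};
}

Mat4 scale(double s) {
    return Mat4{{{s, 0, 0, 0}, {0, s, 0, 0}, {0, 0, s, 0}, {0, 0, 0, 1}}};
}

// Rotation about the Z axis (assumed axis, angle in degrees).
Mat4 rotateZ(double degrees) {
    const double rad = degrees * std::acos(-1.0) / 180.0;
    const double c = std::cos(rad), s = std::sin(rad);
    return Mat4{{{c, s, 0, 0}, {-s, c, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}}};
}

// True when every entry is within eps of the identity matrix.
bool isIdentity(const Mat4& a, double eps) {
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            if (std::fabs(a.m[i][j] - (i == j ? 1.0 : 0.0)) > eps)
                return false;
    return true;
}

// Translate by 10, rotate by 90, scale by 0.5, applied left to right.
Mat4 forwardTransform() {
    return mul(mul(translate(10, 0, 0), rotateZ(90)), scale(0.5));
}

// Undo each step in the opposite order: scale by 2, rotate by -90, translate by -10.
Mat4 inverseTransform() {
    return mul(mul(scale(2.0), rotateZ(-90)), translate(-10, 0, 0));
}
```

Multiplying forwardTransform() by inverseTransform() gives the identity matrix up to floating point rounding, which is why isIdentity takes a tolerance.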

The camera space transform can be easily specified by the function gluLookAt. The gluLookAt reference page describes how the matrix is created for OpenGL. Here’s an easier-to-read implementation using GLSL-like C++ types.

mat4 makeGluLookAt(vec3 eye, vec3 center, vec3 up)
{
    // forward pointing vector
    vec3 f(center - eye);
    f.normalize();
    up.normalize();

    // right pointing vector
    vec3 s(f.cross(up));

    // orthonormal up vector
    vec3 u(s.cross(f));

    s.normalize();
    u.normalize();

    // construct orthonormal orientation transfom
    mat4 Orient(
        s.x, s.y, s.z, 0.0f,
        u.x, u.y, u.z, 0.0f,
        -f.x, -f.y, -f.z, 0.0f,
        0.0f, 0.0f, 0.0f, 1.0f
    );

    // translate the new coordinate system to the origin
    mat4 Translate(
        1.0f, 0.0f, 0.0f, -eye.x,
        0.0f, 1.0f, 0.0f, -eye.y,
        0.0f, 0.0f, 1.0f, -eye.z,
        0.0f, 0.0f, 0.0f, 1.0f
    );

    return Orient * Translate;
}

If you pay close attention to the Orient matrix, you’ll see that the rows represent the right, up and -forward vectors that will correspond to +X, +Y and +Z (remember that the OpenGL camera looks down -Z). This is a standard change of orientation matrix; it effectively rotates the world around so that the +X, +Y, and +Z axes line up with the right, up and -forward vectors of the camera. We then translate by the position of the camera to move the old origin to the camera’s origin.

One sticky part to note about all of this is that we have been working with row-wise matrices that are intended to be post multiplied. That is, we start with the vertex, then multiply by the transforms, then multiply by the camera matrix. But OpenGL uses pre multiplied matrices to achieve the same effect (that’s why you clear the model view matrix, then specify the camera, then specify the transforms and finally the vertex last). In order to end up with the same matrices in the end, OpenGL needs to use transposed matrices, or column-wise matrices. This means that our function needs to be reworked. One important point to remember: the transpose of multiplied matrices is the same as multiplying the transposes of the matrices in reverse order, i.e. (A * B)^T = B^T * A^T. Here is the correct OpenGL friendly code.

mat4 makeGluLookAt(vec3 eye, vec3 center, vec3 up)
{
    vec3 f(center - eye);
    f.normalize();
    up.normalize();

    vec3 s(f.cross(up));
    vec3 u(s.cross(f));

    s.normalize();
    u.normalize();

    mat4 Orient(
        s.x, u.x, -f.x, 0.0f,
        s.y, u.y, -f.y, 0.0f,
        s.z, u.z, -f.z, 0.0f,
        0.0f, 0.0f, 0.0f, 1.0f
    );

    mat4 Translate(
        1.0f, 0.0f, 0.0f, 0.0f,
        0.0f, 1.0f, 0.0f, 0.0f,
        0.0f, 0.0f, 1.0f, 0.0f,
        -eye.x, -eye.y, -eye.z, 1.0f
    );

    return Translate * Orient;
}
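The transpose rule that connects the two listings, (A * B)^T = B^T * A^T, can be checked numerically. This is only a sketch: Mat4 and the helper names are hypothetical, and the values are small integers so that the comparison can be exact.

```cpp
#include <cassert>

// Hypothetical 4x4 matrix type used only for this check.
struct Mat4 { double m[4][4]; };

Mat4 mul(const Mat4& a, const Mat4& b) {
    Mat4 out{};  // zero-initialized
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                out.m[i][j] += a.m[i][k] * b.m[k][j];
    return out;
}

// Swaps rows and columns.
Mat4 transpose(const Mat4& a) {
    Mat4 out{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            out.m[j][i] = a.m[i][j];
    return out;
}

// Exact entry-by-entry comparison (fine for small integer values).
bool equal(const Mat4& a, const Mat4& b) {
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            if (a.m[i][j] != b.m[i][j])
                return false;
    return true;
}

// Checks (A * B)^T == B^T * A^T for the given pair.
bool transposeRuleHolds(const Mat4& a, const Mat4& b) {
    return equal(transpose(mul(a, b)), mul(transpose(b), transpose(a)));
}
```

Note that the order really does flip: transposing each matrix but keeping the original order generally gives a different result, which is why the reworked function returns Translate * Orient instead of Orient * Translate.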

It is important to realise that OpenGL’s Model View matrix is actually M * C, so if we want to get the position of the vertex in world space, we will have to use the modelview matrix to go to camera space, then multiply by C^-1 (the inverse of the camera matrix). This gives us V * M * C * C^-1 = V * gl_ModelViewMatrix * C^-1 = V * M. And world space is exactly what we need to do shadow mapping, refraction, reflection and a plethora of other things. OpenGL took care of this for us with the GL_ARB_shadow extension. By using glTexGen, OpenGL computes the inverse camera matrix and applies it without us having to fuss with inverses. But now we’ll need to do an inverse ourselves.

Inverse gluLookAt

Remember that the camera matrix is given to us by gluLookAt, and above, we have the exact implementation of how the matrix is created. If you know your matrices, you may notice some properties that the camera matrix adheres to when being constructed this way. Normally, for a generic matrix, computing the inverse is a slow and painful process. You really don’t want to invert a lot of matrices in order to run your program in real time. Luckily, the camera matrix isn’t just a generic matrix; it has a specific construction with easily invertible properties.

If you paid attention to the construction of the Orient matrix, you would have seen that the columns represent an orthonormal coordinate system. That is, the vectors s, u and -f are all at 90 degree angles to each other and their lengths are equal to 1. The property that this gives us is that the dot product of any two of those vectors equals 0 when the vectors are different. When the vectors are the same, the dot product is the length of the vector squared, or 1.

Now, imagine a matrix multiply as the dot product of two vectors, the row of the first matrix and the column of the second matrix. What we want to do is construct the inverse of Orient such that Orient * Orient^-1 equals the identity matrix (0’s everywhere with 1’s on the diagonal). See where I’m going? As it turns out, the inverse of a matrix that represents an orthonormal coordinate system is exactly its transpose.

mat4 Orient(
    s.x, u.x, -f.x, 0.0f,
    s.y, u.y, -f.y, 0.0f,
    s.z, u.z, -f.z, 0.0f,
    0.0f, 0.0f, 0.0f, 1.0f
);

mat4 OrientInverse(
    s.x, s.y, s.z, 0.0f,
    u.x, u.y, u.z, 0.0f,
    -f.x, -f.y, -f.z, 0.0f,
    0.0f, 0.0f, 0.0f, 1.0f
);
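To see why the transpose works as the inverse, note that each entry of OrientInverse * Orient is just the dot product of two of the basis vectors. The sketch below checks that for the 3×3 rotation part; Vec3, Basis and the helper names are hypothetical stand-ins for the GLSL-like types used above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the vec3 type used in the listings above.
struct Vec3 { double x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 normalize(Vec3 v) {
    const double len = std::sqrt(dot(v, v));
    return {v.x / len, v.y / len, v.z / len};
}

// The three camera axes: right (s), up (u) and -forward.
struct Basis { Vec3 s, u, negF; };

// Builds the basis the same way makeGluLookAt does.
Basis makeBasis(Vec3 eye, Vec3 center, Vec3 up) {
    const Vec3 f = normalize(sub(center, eye));
    const Vec3 s = normalize(cross(f, normalize(up)));
    const Vec3 u = normalize(cross(s, f));
    return Basis{s, u, Vec3{-f.x, -f.y, -f.z}};
}

// Entry (row, col) of OrientInverse * Orient: row `row` of the transpose
// dotted with column `col` of Orient, i.e. the dot product of two axes.
double orientProductEntry(const Basis& b, int row, int col) {
    const Vec3 axes[3] = {b.s, b.u, b.negF};
    return dot(axes[row], axes[col]);
}
```

For any camera whose up vector is not parallel to the view direction, the entries come out as 1 on the diagonal and 0 elsewhere (up to rounding), which is the identity matrix.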

But the camera matrix also contains a translation that gets multiplied with Orient. We know the inverse of a translation is just a negative translation, so that’s easy. So, if C = Translate * Orient, then C^-1 = (Translate * Orient)^-1 = Orient^-1 * Translate^-1. Hey, we know all those pieces! Let’s put it into OpenGL friendly code.

mat4 makeGluLookAtInverse(vec3 eye, vec3 center, vec3 up)
{
    vec3 f(center - eye);
    f.normalize();
    up.normalize();

    vec3 s(f.cross(up));
    vec3 u(s.cross(f));

    s.normalize();
    u.normalize();

    mat4 OrientInverse(
        s.x, s.y, s.z, 0.0f,
        u.x, u.y, u.z, 0.0f,
        -f.x, -f.y, -f.z, 0.0f,
        0.0f, 0.0f, 0.0f, 1.0f
    );

    mat4 TranslateInverse(
        1.0f, 0.0f, 0.0f, 0.0f,
        0.0f, 1.0f, 0.0f, 0.0f,
        0.0f, 0.0f, 1.0f, 0.0f,
        eye.x, eye.y, eye.z, 1.0f
    );

    return OrientInverse * TranslateInverse;
}

That is probably the most difficult part of doing shadow mapping and other real time techniques (once you get past all of the theory, that is). And look, it’s all wrapped up in a tiny function! Store that into a texture matrix and your shaders have quick and easy access to world space.
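A quick way to gain confidence in the two functions is a round trip: push a world space point into camera space, then bring it back with the inverse. The following is a minimal sketch under assumptions: Vec3, Mat4, lookAt, lookAtInverse and transformPoint are hypothetical names, and points are treated as row vectors (x, y, z, 1) multiplied on the left, matching the OpenGL friendly layout above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for the vec3/mat4 types used in the listings above.
struct Vec3 { double x, y, z; };
struct Mat4 { double m[4][4]; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 normalize(Vec3 v) {
    const double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

Mat4 mul(const Mat4& a, const Mat4& b) {
    Mat4 out{};  // zero-initialized
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                out.m[i][j] += a.m[i][k] * b.m[k][j];
    return out;
}

// Treats p as the row vector (x, y, z, 1) and returns the xyz of p * M.
Vec3 transformPoint(Vec3 p, const Mat4& a) {
    const double v[4] = {p.x, p.y, p.z, 1.0};
    double out[3] = {0.0, 0.0, 0.0};
    for (int c = 0; c < 3; ++c)
        for (int r = 0; r < 4; ++r)
            out[c] += v[r] * a.m[r][c];
    return {out[0], out[1], out[2]};
}

// Same layout as the OpenGL friendly makeGluLookAt: Translate * Orient.
Mat4 lookAt(Vec3 eye, Vec3 center, Vec3 up) {
    const Vec3 f = normalize(sub(center, eye));
    const Vec3 s = normalize(cross(f, normalize(up)));
    const Vec3 u = normalize(cross(s, f));
    const Mat4 orient{{{s.x, u.x, -f.x, 0}, {s.y, u.y, -f.y, 0},
                       {s.z, u.z, -f.z, 0}, {0, 0, 0, 1}}};
    const Mat4 translate{{{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0},
                          {-eye.x, -eye.y, -eye.z, 1}}};
    return mul(translate, orient);
}

// Same layout as makeGluLookAtInverse: OrientInverse * TranslateInverse.
Mat4 lookAtInverse(Vec3 eye, Vec3 center, Vec3 up) {
    const Vec3 f = normalize(sub(center, eye));
    const Vec3 s = normalize(cross(f, normalize(up)));
    const Vec3 u = normalize(cross(s, f));
    const Mat4 orientInverse{{{s.x, s.y, s.z, 0}, {u.x, u.y, u.z, 0},
                              {-f.x, -f.y, -f.z, 0}, {0, 0, 0, 1}}};
    const Mat4 translateInverse{{{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0},
                                 {eye.x, eye.y, eye.z, 1}}};
    return mul(orientInverse, translateInverse);
}
```

For example, with the camera at (0, 0, 5) looking at the origin, the world origin lands at z = -5 in camera space (the camera looks down -Z), and applying lookAtInverse to that result returns the origin again.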

Refraction: Part 1

One Bounce Refraction

So, that engine I was working on: I made quite a bit of progress and decided to try an actual project with it. The project is real time refraction using GLSL. The first step is very quick and easy.

One bounce refraction of an infinite environment.

First, create a skybox in order to create an infinite environment and render it to the screen. Next, enable the environment map (let’s assume it is stored as a texture cube map) and activate the GLSL program. Render the refractive object and disable everything you just enabled. The shaders I used are modified from the Orange Book.

First, the vertex shader. It declares two varying vec3s: i stores the vertex position in eye space and n stores the vertex normal in eye space.

varying vec3 i;
varying vec3 n;

void main()
{
  vec4 ecPosition  = gl_ModelViewMatrix * gl_Vertex;

  i = ecPosition.xyz / ecPosition.w;
  n = gl_NormalMatrix * gl_Normal;

  gl_Position = ftransform();
}

Now, the fragment shader. Using i and n we find the vector that is refracted at the surface of the object. This vector is multiplied by gl_TextureMatrix[0], which holds the inverted modelview matrix. This converts the refracted vector from eye space into world space. Using that vector to index into the cube map gives us our final color.

uniform samplerCube texture;
uniform float indexOfRefraction;

varying vec3 i;
varying vec3 n;

void main()
{
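  // varyings are read-only in a fragment shader, so normalize into locals
  vec3 I = normalize(i);
  vec3 N = normalize(n);

  vec3 Refracted = refract(I, N, indexOfRefraction);
  // w = 0.0 marks Refracted as a direction, so the camera translation is ignored
  Refracted = vec3(gl_TextureMatrix[0] * vec4(Refracted, 0.0));

  vec3 refractColor = vec3(textureCube(texture, Refracted));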

  gl_FragColor   = vec4(refractColor, 1.0);
}

If you look at the Orange Book example, you see a few extra features to make the refracting object look more realistic. The first is the Fresnel effect. This is when you view the refracting object at such an angle that you will actually see a reflection instead. Next is dispersion, or chromatic aberration. We boil it down to supplying a slightly different index of refraction for each color channel.

The vertex shader stays the same, but the fragment shader changes slightly. We find a different refraction vector for each color channel, and also a reflection vector. We then look up the colors and mix them together based on the Fresnel factor.

uniform samplerCube texture;
uniform vec4 indexOfRefraction; // {R, G, B, Fresnel}

varying vec3 i;
varying vec3 n;

const float FresnelPower = 5.0;

void main()
{
  // varyings are read-only in a fragment shader, so normalize into locals
  vec3 I = normalize(i);
  vec3 N = normalize(n);

  float Ratio   = indexOfRefraction.a + (1.0 - indexOfRefraction.a) * pow((1.0 - dot(-I, N)), FresnelPower);

  // w = 0.0 marks each vector as a direction, so the camera translation is ignored
  vec3 RefractR = refract(I, N, indexOfRefraction.r);
  RefractR = vec3(gl_TextureMatrix[0] * vec4(RefractR, 0.0));

  vec3 RefractG = refract(I, N, indexOfRefraction.g);
  RefractG = vec3(gl_TextureMatrix[0] * vec4(RefractG, 0.0));

  vec3 RefractB = refract(I, N, indexOfRefraction.b);
  RefractB = vec3(gl_TextureMatrix[0] * vec4(RefractB, 0.0));

  vec3 Reflect  = reflect(I, N);
  Reflect  = vec3(gl_TextureMatrix[0] * vec4(Reflect, 0.0));

  vec3 refractColor, reflectColor;

  refractColor.r = vec3(textureCube(texture, RefractR)).r;
  refractColor.g = vec3(textureCube(texture, RefractG)).g;
  refractColor.b = vec3(textureCube(texture, RefractB)).b;

  reflectColor   = vec3(textureCube(texture, Reflect));

  vec3 color     = mix(refractColor, reflectColor, Ratio);

  gl_FragColor   = vec4(color, 1.0);
}
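To get a feel for how the Ratio term behaves, the same formula can be evaluated on the CPU. This is only a sketch: fresnelRatio and its parameter names are hypothetical, and cosTheta stands for dot(-i, n), the cosine of the angle between the view direction and the surface normal.

```cpp
#include <cassert>
#include <cmath>

// Mirrors the Ratio line in the shader:
//   bias + (1 - bias) * (1 - cosTheta)^power
// bias corresponds to indexOfRefraction.a and power to FresnelPower.
double fresnelRatio(double bias, double power, double cosTheta) {
    return bias + (1.0 - bias) * std::pow(1.0 - cosTheta, power);
}

// Linear blend, equivalent to GLSL mix(a, b, t) for scalars.
double mix(double a, double b, double t) {
    return a * (1.0 - t) + b * t;
}
```

Looking straight at the surface (cosTheta = 1) the ratio is just the bias, so the refracted color dominates; at a grazing angle (cosTheta = 0) the ratio reaches 1 and the result is pure reflection.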

GL_ARB_texture_non_power_of_two

If the GL_ARB_texture_non_power_of_two extension is supported on your system, then you will be able to use textures of any size (within the limits of the system). No functions or constants are added with this extension.

Not exactly the most complicated of extensions.

GL_EXT_shadow_funcs

The GL_EXT_shadow_funcs extension requires GL_ARB_shadow and GL_ARB_depth_texture in order to have any effect. In the GL_ARB_shadow example on this site we use GL_LEQUAL when setting the GL_TEXTURE_COMPARE_FUNC_ARB texture parameter. GL_ARB_shadow allows this comparison function to be GL_GEQUAL as well, but not any of the other comparison functions.

This extension allows us to use GL_GREATER or GL_EQUAL or any of the eight comparison functions given to us by OpenGL. However, these other functions will not show you much difference. GL_LEQUAL and GL_LESS will look very similar, GL_EQUAL will most likely not be worth using, nor would any of the other comparison functions that this extension gives you. The reasons why you won’t see any difference are given in the extension specifications:

Are there issues with GL_EQUAL and GL_NOTEQUAL?

The GL_EQUAL mode (and GL_NOTEQUAL) may be difficult to obtain well-defined behavior from. This is because there is no guarantee that the divide done by the shadow mapping r/q division is going to exactly match the z/w perspective divide and depth range scale & bias used to generate depth values. Perhaps it can work in a well-defined manner in orthographic views or if you can guarantee that the texture hardware’s r/q is computed with the same hardware used to compute z/w (NVIDIA’s NV_texture_shader extension can provide such a guarantee).

Similarly, GL_LESS and GL_GREATER are only different from GL_LEQUAL and GL_GEQUAL respectively by a single unit of depth precision which may make the difference between these modes very subtle.

So, unless you are using very specific hardware, you will most likely never need these extra functions.

GL_ARB_shadow

The GL_ARB_shadow extension allows us to create simple shadow maps. It requires the GL_ARB_depth_texture extension. For more information on the theory and implementation of shadow maps see NVidia’s hardware shadow maps document.

The theory behind using shadow maps is very simple and can be summed up in one sentence: any point that the light cannot see is in shadow. To implement this, we take a picture of the scene from the light’s point of view, then compare it to the picture of the scene from the camera’s point of view. Because we never use the color information from the light’s point of view, we can just take the depth information and store it in a texture to compare with later. This is where we need the depth texture. Note that a single depth texture cannot cover the entire scene; to do this one would need to use multiple textures. Typically depth textures are used with spotlights, or, in games like Guild Wars and Battlefield 2, a depth texture is used on each model that needs a shadow.

In order to speed things up, we break the world into two parts: shadow casters and shadow receivers. When we take the depth texture, we make sure to only render the shadow casters. Then we render the entire scene from the camera’s viewpoint. When coloring each pixel, the 3D location of the point in camera space is converted into a 3D point in light space using a well-defined method. This, in essence, gives us the distance to the light. If this distance is larger than the distance in the shadow map, then the point is in shadow and colored appropriately; otherwise it is colored normally.

In order to convert a point in camera space to a point in light space, we simply multiply it by the various matrices that we use to render it to the screen. First, an overview. When we send a vertex into OpenGL using glVertex, it is multiplied by the camera’s modelview matrix MC. In order to get the vertex back, we must multiply by the inverse of the camera’s modelview, MC^-1. If everything is set up correctly, the result is the same vertex sent into OpenGL when creating the depth texture. The result is then multiplied by the light’s modelview matrix ML and the light’s projection matrix PL. This gives us the exact same result as the depth texture, with one caveat: the projected points go from -1 to 1, whereas textures go from 0 to 1. This is easily overcome by scaling the result by half, then translating by 0.5, which is accomplished with the bias matrix B.

Now the point that we are considering is in the texture space of the shadow map. If the input was {x, y, z, w} and the output is {s, t, r, q}, then the value of the shadow map at {s, t} is the depth that the light sees, while r is the distance from the point in question to the light. So if we compare the depth in the texture to r, we know whether or not the point is in shadow. This is the reasoning for the texture property added by the extension. We enable it with glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_MODE_ARB, GL_COMPARE_R_TO_TEXTURE_ARB). The way we compare is to say a point is not in shadow when r is less than or equal to the depth stored in the texture; it is in shadow otherwise. Hence glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_COMPARE_FUNC_ARB, GL_LEQUAL).

The matrices above are easily found. ML is given as the matrix from the gluLookAt command, and PL is given by whatever you used to set up the projection matrix (in our example, we keep the projection matrix the same between the light and camera views, so we just grab the current projection matrix). The bias matrix B is well defined and set up in the code below. The only tricky part is finding the inverse of the camera’s modelview matrix. It turns out that if we enable texture coordinate generation with GL_EYE_LINEAR and GL_EYE_PLANE, OpenGL supplies the inverse for us.
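The effect of the bias matrix and the GL_LEQUAL comparison can be sketched on the CPU. This is an illustration only, not how the extension is implemented: Vec4, applyBias and isLit are hypothetical names, and the depth comparison the hardware performs is reduced to a single function.

```cpp
#include <array>
#include <cassert>

// Hypothetical homogeneous point type: {x, y, z, w} in, {s, t, r, q} out.
using Vec4 = std::array<double, 4>;

// Effect of the bias matrix B on a projected point: scale by 0.5, then
// offset by 0.5 * w, so that after dividing by w the range [-1, 1]
// becomes [0, 1].
Vec4 applyBias(const Vec4& p) {
    return Vec4{0.5 * p[0] + 0.5 * p[3],
                0.5 * p[1] + 0.5 * p[3],
                0.5 * p[2] + 0.5 * p[3],
                p[3]};
}

// GL_LEQUAL style comparison: the point is lit when its depth r (after the
// divide by q) is less than or equal to the depth stored in the shadow map.
bool isLit(const Vec4& strq, double storedDepth) {
    const double r = strq[2] / strq[3];
    return r <= storedDepth;
}
```

A projected point at (-1, -1, -1) maps to texture coordinate (0, 0, 0) and a point at (1, 1, 1) maps to (1, 1, 1). A point whose biased depth is 0.75 against a stored depth of 0.5 is behind whatever the light saw first, so it is in shadow.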

Read more about GL_ARB_shadow

GLWindow

Download GLWindow

Before we do any hardcore OpenGL programming, we need a framework for creating a window. Sure, you could use SDL or GLUT, but I want to play around with Win32. For that, we shall go to the old standard and copy NeHe’s code. Lesson 45 has a very nice example, but that’s C code. Let’s modify that to make it C++ worthy and create a GLWindow class. So, if you download NeHe’s Win32 code, you’ll see a lot of similarities to GLWindow, but I also made my own modifications.

The GLWindow class is meant to be an easy to use framework for setting up an OpenGL project in no time. It allows access to most aspects of the code, making it easily expanded. The main workflow consists of subclassing GLWindow and overriding some of its virtual methods. Defaults are provided, so there is no need to override all of the methods. Once a subclass is defined, you instantiate it and call mainLoop() to run the program. Piece of cake. Here’s the class structure.

class GLWindow
{
public:
    GLWindow();
    virtual ~GLWindow();  // virtual, since the class is meant to be subclassed

    void redirectIOToConsole();
    void terminateApplication();
    void toggleFullscreen();
    bool changeScreenResolution(int width, int height, int bitsPerPixel);
    bool createWindow();
    bool destroyWindow();
    LRESULT CALLBACK windowProc(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM lParam);
    bool registerWindowClass();
    void mainLoop(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow);

    virtual bool init();
    virtual void deinit();
    virtual void update(DWORD elapsedTime);
    virtual void draw();
    virtual void reshape(int width, int height);
    virtual bool handleMessage(UINT uMsg, WPARAM wParam, LPARAM lParam);

    virtual void onLMouseDown();
    virtual void onLMouseUp();
    virtual void onMMouseDown();
    virtual void onMMouseUp();
    virtual void onRMouseDown();
    virtual void onRMouseUp();
    virtual void onMouseMove(int x, int y, WPARAM keys);
    virtual void onMouseWheel(WPARAM keys, int wheelDelta, int x, int y);

    char title[256];
    int starting_width, current_width;
    int starting_height, current_height;
    int bitsPerPixel;
    bool isFullScreen;

protected:
    HINSTANCE hInstance;
    HWND hWnd;
    HDC hDC;
    HGLRC hRC;
    char className[64];

    bool isProgramLooping;
    bool createFullscreen;
    bool isMessagePumpActive;

    bool isVisible;
    DWORD lastTickCount;

    bool keys[256];

private:
    void sethWnd(HWND hWnd);
    static LRESULT CALLBACK staticWindowProc(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM lParam);
};
 

See the rest of GLWindow

OpenGL Extension Helpers

A short list of OpenGL extension libraries