Monday, April 7, 2014

Replacing the ModelView, Projection, and Texture Matrices

This post continues the series discussing ways I've rewritten the sample code from the book Game and Graphics Programming for iOS and Android with OpenGL ES 2.0.  Below I'll discuss replacing the last portion of the author's transformation code used in the GFX module with my own library.  This is the last preparation step before I rewrite the GFX module as a proper C++ class.

A copy of the code as described in this post can be downloaded from GitHub using the tag tstack.

Legacy OpenGL supported three transformation stacks:  model-view, projection, and texture. Later OpenGL added optional support for a color transformation stack via the
ARB_imaging extension.  OpenGL provided a collection of functions for manipulating these transformation stacks, and the programmer selected which stack those functions operated on by calling glMatrixMode().  This functionality has been deprecated, but graphics programmers still need it.  Using modern versions of OpenGL, such as OpenGL ES, the programmer either has to provide this functionality for himself, or use a library which provides it for him.

The GFX module provided by the author mimics this legacy functionality, without support for the color transformation stack.  The author's implementation limits the stack depth of the model-view, projection, and texture transformation stacks to 32 entries each.  Legacy OpenGL required that the model-view stack support a depth of at least 32 entries.  The projection and texture stacks were only required to support a depth of at least 2.

That the original GFX module hardcodes a stack depth of 32 for the projection and texture stacks is almost certainly a waste.  For the model-view transformation stack I, personally, have never needed more than 32 entries, but others may need support for a deeper stack, so hardcoding the model-view stack to 32 entries may be short-sighted.  In this post I replace these hardcoded limits, substituting my own transformation stack class objects for the fixed-length arrays of mat4 entries.  I named my transformation stack class TStack.

At its core, the TStack class is a std::vector<mat4>.  By using the vector class template the transformation stack depth is only limited by the amount of memory which the application can access.

Since TStack implements a stack, it has, as one would expect, methods for pushing and popping the stack.  The default constructor creates a stack with one entry containing the identity matrix.  Any attempt to pop the last entry off of the stack will cause the pop() method to throw an exception.
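
To make the design concrete, below is a minimal sketch of how such a class can be put together.  This is not the actual code from my library; mat4 is assumed to be the matrix type from my math library, and makeIdentity() is a hypothetical helper standing in for however the identity matrix is actually produced.  The real class also has the transformation methods listed in the table below.

    #include <vector>
    #include <stdexcept>

    // Minimal sketch of the TStack core; mat4 and makeIdentity() are
    // assumed names, not the exact code from my library.
    class TStack {
    private:
        std::vector<mat4>   _stack;
    public:
        // The stack starts with a single entry: the identity matrix.
        TStack() : _stack(1, makeIdentity()) { }

        // push() duplicates the top entry so transformations can be
        // applied and later discarded with pop().
        void push() { _stack.push_back(_stack.back()); }

        // The last entry may never be popped off of the stack.
        void pop() {
            if (_stack.size() == 1)
                throw std::underflow_error("TStack::pop: last entry");
            _stack.pop_back();
        }

        void loadIdentity()            { _stack.back() = makeIdentity(); }
        void loadMatrix(const mat4 &m) { _stack.back() = m; }

        const mat4 &top() const { return _stack.back(); }
    };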

In the author's original matrix code, the model-view and projection stack transformations were implemented as functions which operated on mat4 objects.  When I implemented my matrix classes I omitted this functionality from the mat4 class and implemented it as part of the TStack class.  The operations (transformations) which one needs for the transformation stacks aren't general purpose 4D transformations; they're really 3D transformations done in 4D homogeneous space.  If you don't know exactly what that means, that's okay; you don't really need that level of mathematics knowledge to use the transformation stack operations.  For those of you who do understand what that means, I hope you'll understand (though not necessarily agree with) why I didn't include those features as part of a generic linear algebra toolset.

Below is a table showing the correspondence between the legacy (deprecated) OpenGL functions, the GFX module functions, and the TStack methods:

Legacy OpenGL    GFX                        TStack
glMatrixMode     GFX_set_matrix_mode        (none)
glPushMatrix     GFX_push_matrix            push
glPopMatrix      GFX_pop_matrix             pop
glLoadMatrix     GFX_load_matrix            loadMatrix
glLoadIdentity   GFX_load_identity          loadIdentity
glMultMatrix     GFX_multiply_matrix        multiplyMatrix
glRotate         GFX_rotate                 rotate
glScale          GFX_scale                  scale
glTranslate      GFX_translate              translate
glFrustum        (none)                     frustum
gluPerspective   GFX_set_perspective        perspective
glOrtho          GFX_ortho                  ortho
gluOrtho2D       GFX_set_orthographic_2d    ortho
gluLookAt        GFX_look_at                lookAt


Looking at the GFX code modified to use the TStack methods, you'll see that my versions of the GFX functions often directly call the corresponding TStack method.  Some cases, such as GFX_set_perspective(), are a little more complicated, but not much.
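
Conceptually, the forwarding looks something like the sketch below.  The matrix-mode constants, the matrix_mode variable, and the stack names here are illustrative, not the exact names in my code.

    // Sketch: GFX_push_matrix() forwards to the TStack selected by the
    // current matrix mode.
    void GFX_push_matrix(void) {
        switch (matrix_mode) {
            case MODELVIEW_MATRIX:  modelview_stack.push();  break;
            case PROJECTION_MATRIX: projection_stack.push(); break;
            case TEXTURE_MATRIX:    texture_stack.push();    break;
        }
    }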

I think the easiest way for beginning 3D graphics programmers to understand the geometry transformations is as three distinct sets of operations:
  1. Placing the vertex (or as a group, the object's vertices) into "the world" or "the scene".
  2. Placing the camera in "the world" at a particular place, pointed in a particular direction, and rotated to a particular orientation.
  3. Selecting a lens (such as a wide-angle lens, a telephoto lens, something in between, or even an orthographic projection) which the camera uses to view the world.
When an object is drawn on screen we can think of the vectors describing the geometry as being processed first by the model matrix (M), then the view matrix (V), and finally by the projection matrix (P).  We might diagram the process as


v → M → V → P → screen

Mathematically, this can be written:


v * M * V * P

Notice that I've described work for three transformation matrices but there are only two transformation stacks allocated for geometry calculations: the model-view transformation stack and the projection stack.  Normally, the first transformation loaded into the model-view stack is the view matrix (i.e., the matrix V which places and orients the camera).  After loading the view matrix, the various transformations which place the objects into "the scene/world" are applied on top of it.  For the novice 3D programmer this should help clarify why we normally only need a model-view transformation stack and a projection transformation stack.
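
To make that concrete, a typical frame set up with a TStack for the model-view matrix might look like the sketch below.  The method names come from the table above; the eye, center, up, angle, and object_location variables, and drawObject(), are placeholders.

    // Illustrative frame setup: the view matrix V goes in first, then
    // the model transformations M are stacked on top of it.
    modelview.loadIdentity();
    modelview.lookAt(eye, center, up);     // place and orient the camera (V)

    modelview.push();                      // save the camera-only matrix
    modelview.translate(object_location);  // place the object in the scene (M)
    modelview.rotate(angle, 0.0f, 0.0f, 1.0f);
    drawObject();
    modelview.pop();                       // back to the camera-only matrix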

Though there are a variety of TStack transformation methods, generally speaking, if a particular transformation method is used for one of the transformation matrices (model, view, or projection), it's not typically used for any of the others.  More advanced programmers will know how to bend the rules a bit, but when programming 3D graphics:
  1. The rotation, scaling, and translation methods are used for the model matrix (matrix M above).
  2. The lookAt() method is generally only used for the view matrix (matrix V above).
  3. And the frustum(), ortho(), and perspective() methods are generally only used for the projection matrix (matrix P above).
Again, since these matrix stack functions have been deprecated in OpenGL, every programmer needs to write his own functions to replace this functionality, or adopt a library which someone else has written to provide it.  The author of the book provides this functionality for us in the GFX module.  His functions parallel, nearly if not exactly, the functions which have been deprecated in OpenGL.  My TStack objects support all of the transformations which the author and OpenGL supported for 3D transformations, plus a few more.  Most significantly (IMHO), the TStack object allows rotations to be described using angles in degrees or radians, and makes it easy to use quaternions to describe rotations.  The use of quaternions has efficiency implications which I'll address later in this post.


Since the TStack object implements all of the necessary 3D graphics matrix methods I was able to delete the author's mat3_copy_mat4(), mat4_translate(), mat4_scale(), mat4_rotate() and mat4_ortho() functions.

Now let's see how the GFX module's new functionality supported by the TStack class can make the application code more efficient.

Going back to the original implementation of the chapter7-1 sample program there was this sequence of statements:

    GFX_load_identity();
    .
    .
    .
    GFX_translate(eye_location.x,
                  eye_location.y,
                  eye_location.z);
    GFX_rotate(rotz, 0.0f, 0.0f, 1.0f);
    GFX_rotate(90.0f, 1.0f, 0.0f, 0.0f);
    mat4_invert(GFX_get_modelview_matrix());

The first optimization I performed in an earlier post was to eliminate the matrix inversion by reversing the order of the translation and rotations, and by reversing their respective directions.  The new code became:

    GFX_load_identity();
    .
    .
    .
    GFX_rotate(-90.0f, 1.0f, 0.0f, 0.0f);
    GFX_rotate(-rotz, 0.0f, 0.0f, 1.0f);
    GFX_translate(-eye_location.x,
                  -eye_location.y,
                  -eye_location.z);

More recently, I added a version of GFX_translate() which would take a vec3 argument rather than 3 scalars.  This doesn't make the code faster but (IMHO) it does make it easier to write.  The code became:

    GFX_load_identity();
    .
    .
    .
    GFX_rotate(-90.0f, 1.0f, 0.0f, 0.0f);
    GFX_rotate(-rotz, 0.0f, 0.0f, 1.0f);
    GFX_translate(-eye_location);

There's still another optimization which can be made.  According to Euler's rotation theorem, two or more consecutive rotations can be described as a single rotation.  The code has two consecutive rotations, so the question becomes: how can this calculation be done using a single rotation?  The easiest way I know of is to express the two rotations as quaternions.  These two calls to GFX_rotate() have the necessary pieces: an angle of rotation, and a rotation vector of length 1.  To create the rotation quaternion we first divide the rotation angle by 2, and compute the sine and cosine of this half-angle.  The cosine becomes the real part of the quaternion; the sine times the normalized rotation vector becomes the imaginary (i.e., i, j, and k) portion of the quaternion.  Since the first rotation uses a constant angle and a constant rotation vector, the quaternion representing the rotation is also a constant.  The cosine of -45° (remember, half of the angle -90°) is 1 divided by the square root of 2, also written as the square root of 2 divided by 2.  The C/C++ math library has this value pre-defined as the constant M_SQRT1_2.  The sine of -45° is the negative of that value.  The first rotation can be rewritten as

    static const quaternion q(M_SQRT1_2, -M_SQRT1_2, 0.0f, 0.0f);
    GFX_rotate(q);

Note that since we were able to precompute the sine and cosine values the code is already more efficient because calculating trigonometric functions (sine and cosine) is expensive.

Computing the quaternion for the second rotation is almost as easy.  I've added a new constant, DEG_TO_RAD_DIV_2, to SDK/common/types.h to simplify the conversion of old GFX_rotate() calls into GFX_rotate(const quaternion &) calls:

    float   alpha(-rotz*DEG_TO_RAD_DIV_2);
    float   cosAlpha(cosf(alpha)), sinAlpha(sinf(alpha));
    GFX_rotate(quaternion(cosAlpha, 0.0f, 0.0f, sinAlpha));

This works but it still leaves us with two rotations.  We could take the product of the quaternions dynamically by doing something like this:

    static const quaternion q0(M_SQRT1_2, -M_SQRT1_2, 0.0f, 0.0f);
    quaternion q1(cosAlpha, 0.0f, 0.0f, sinAlpha);
    GFX_rotate(q0 * q1);

This gets us down to a single rotation but every time the code hits this spot we have to do a quaternion multiplication.  Each general purpose quaternion multiplication costs 16 floating point multiplications and 12 floating point additions/subtractions*.

* For performance purposes I'll refer to floating point additions/subtractions as just floating point additions, since floating point addition and floating point subtraction have the same performance cost and it makes the text easier to read/write.  This is a common way of discussing floating point performance or cost.
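
For reference, here is what a textbook quaternion (Hamilton) product looks like in code, which makes the operation count easy to verify.  I'm using direct r/i/j/k member access for readability; my classes actually expose these components through the overloaded "‑>" operator (described in the post below).

    // General quaternion product in (real, i, j, k) order: 16 floating
    // point multiplications and 12 additions/subtractions.
    quaternion operator*(const quaternion &a, const quaternion &b) {
        return quaternion(a.r*b.r - a.i*b.i - a.j*b.j - a.k*b.k,   // real
                          a.r*b.i + a.i*b.r + a.j*b.k - a.k*b.j,   // i
                          a.r*b.j - a.i*b.k + a.j*b.r + a.k*b.i,   // j
                          a.r*b.k + a.i*b.j - a.j*b.i + a.k*b.r);  // k
    }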

There are only 4 non-zero components in these two quaternions so it should be pretty straightforward to calculate, in advance, a quaternion that represents the product of these two quaternions.  Doing so, the code becomes:

    float   alpha(-rotz*DEG_TO_RAD_DIV_2);
    float   cosAlpha(cosf(alpha)), sinAlpha(sinf(alpha));
    GFX_rotate(quaternion( M_SQRT1_2*cosAlpha,
                          -M_SQRT1_2*cosAlpha,
                           M_SQRT1_2*sinAlpha,
                           M_SQRT1_2*sinAlpha));

Now, instead of performing a quaternion multiply at the cost of 16 multiplications and 12 additions, the code can do the same work with only 4 multiplications (a really smart optimizing compiler might only do two); no additions are needed.  Plus, remember that we were able to avoid computing sine and cosine for one of the angles.

The final version of the code is now

    GFX_load_identity();
    .
    .
    .
    float   alpha(-rotz*DEG_TO_RAD_DIV_2);
    float   cosAlpha(cosf(alpha)), sinAlpha(sinf(alpha));
    GFX_rotate(quaternion( M_SQRT1_2*cosAlpha,
                          -M_SQRT1_2*cosAlpha,
                           M_SQRT1_2*sinAlpha,
                           M_SQRT1_2*sinAlpha));
    GFX_translate(-eye_location);

What used to be a translation, two rotations, and a matrix inversion has been simplified to a single rotation, followed by a translation.

There are numerous places throughout the sample code for the book which have consecutive rotations that can be converted into a single rotation.  To simplify the task of precomputing the product of rotation quaternions for use in the code, I've written a utility program which takes two strings representing quaternions and produces their product.  For example, the create_direction_vector() function in SDK/common/utils.cpp contained three consecutive rotations.  Rather than multiply 3 quaternions by hand I was able to feed them into the utility program, and let it do the hard work:

This program expects to read quaternions as a newline
terminated list of 4 components.  The components must
be separated by commas.

The program can be exited by sending EOF.  For UNIX
machines (including Mac OS X), the default way of
doing this is to type ctrl-D; Windows machines treat
ctrl-Z as EOF, by default.

Enter first quaternion:  cosBeta,0,sinBeta,0
Enter second quaternion:  cosGamma,sinGamma,0,0
cosBeta*cosGamma,cosBeta*sinGamma,sinBeta*cosGamma,-sinBeta*sinGamma

Enter first quaternion:  cosAlpha,0,0,sinAlpha
Enter second quaternion:  cosBeta*cosGamma,cosBeta*sinGamma,sinBeta*cosGamma,-sinBeta*sinGamma
(cosAlpha*cosBeta*cosGamma+sinAlpha*sinBeta*sinGamma),(cosAlpha*cosBeta*sinGamma-sinAlpha*sinBeta*cosGamma),(cosAlpha*sinBeta*cosGamma+sinAlpha*cosBeta*sinGamma),(-cosAlpha*sinBeta*sinGamma+sinAlpha*cosBeta*cosGamma)

Enter first quaternion:  

Using the output from this utility the revised version of create_direction_vector() performs a single rotation:

    // Convert all angles to radians & divide by 2.
    float   alpha = rotz*DEG_TO_RAD_DIV_2;
    float   cosAlpha(cosf(alpha)), sinAlpha(sinf(alpha));
    float   beta  = roty*DEG_TO_RAD_DIV_2;
    float   cosBeta(cosf(beta)), sinBeta(sinf(beta));
    float   gamma = rotx*DEG_TO_RAD_DIV_2;
    float   cosGamma(cosf(gamma)), sinGamma(sinf(gamma));
    float   cAcB(cosAlpha*cosBeta);
    float   sAsB(sinAlpha*sinBeta);
    float   cAsB(cosAlpha*sinBeta);
    float   sAcB(sinAlpha*cosBeta);
    l.loadRotation(quaternion(cAcB*cosGamma+sAsB*sinGamma,
                              cAcB*sinGamma-sAsB*cosGamma,
                              cAsB*cosGamma+sAcB*sinGamma,
                              sAcB*cosGamma-cAsB*sinGamma));

Three rotations have been reduced to 12 floating point multiplications, 4 floating point additions, and one rotation.  And, by using a utility to do the complex manipulation of the quaternion multiplication, I greatly reduced the chance that I'd make a mistake.

This utility program is available at GitHub.


While I have not yet rewritten the GFX module as a full-blown object class, I did make one other major change in that direction.  I made the old GFX_start() function into the GFX constructor.  This was needed to get the TStack objects to initialize correctly.

Next post, I'll finally convert the GFX module into a full C++ object class.

Sunday, January 5, 2014

Replacing the Math Library

This post continues my series about rewriting the code from Game and Graphics Programming for iOS and Android with OpenGL ES 2.0 but heads off in a slightly different direction.  In this post, instead of rewriting one of the author's modules for game development into a proper C++ class, I'm introducing my own linear algebra tools.

A copy of the code as described in this post can be downloaded from GitHub using the tag glml.

If you're familiar with 3D graphics programming you're already aware that the geometric description of objects, placement of objects, placement of the "camera" or "eye", the calculation of perspective, etc., are all done using linear algebra.  Being able to write linear algebra tools using operator overloading and class templates can simplify the expression of the equations needed for 3D graphics.  I have written my own math library which defines vector, matrix, and quaternion types.  Because I wanted a math library which would be useful for a broad range of numerical programming, I have written my routines to only support operations that would normally be found in generic linear algebra and quaternion text books; that is, I have left out operations which are unique to 3D graphics programming, and, therefore, are not generally useful outside of that particular problem set*.

* Okay, I'll admit that, at times, I have wandered from the "one true path", and included extra features.  More on this later.

My list of required features for a math library includes:
  • vector operations:
    • addition,
    • subtraction,
    • unary minus,
    • multiplication by
      • scalars, and
      • matrices,
    • vector dot product,
    • vector cross product (3‑tuples only),
    • division by scalars,
    • equality/inequality testing, and
    • output operator
  • matrix operations:
    • addition,
    • subtraction,
    • unary minus,
    • multiplication by
      • scalars,
      • vectors, and
      • other matrices
    • division by scalars,
    • equality/inequality testing,
    • output operator,
    • square matrix methods:
      • determinant calculation,
      • adjoint calculation, and
      • matrix inversion
Because I wrote my math tools with the idea that they should be generally useful for writing linear algebra equations I've done things differently than OpenGL does.  In particular, when I wrote the definitions of my vector and matrix data types, I implemented them as "row major" rather than "column major" arrays.

FORTRAN is an example of a programming language which supports column major arrays.  In FORTRAN, if you declare a 2‑dimensional array using

    REAL A(2,3)

the FORTRAN compiler lays this out in memory by putting the item A(1,1) at the base of the memory space allocated for A.†  The next sequential location in memory is occupied by A(2,1), then, since we've exhausted the elements of the first column, the next item in memory is A(1,2).  Column major arrays make me a little crazy.  They're like odometers from Wonderland, or something.  They may work but (IMHO) they do so in the least obvious way.

† For those who are unfamiliar with FORTRAN, FORTRAN array indices start at 1 (one) by default, unlike C arrays which always start at 0 (zero).

Technically, C (and by inheritance C++) doesn't have multidimensional arrays but one can construct an array of arrays (But you already knew that, right?  Forgive me for being pedantic.)  If I define an array of arrays in C using

    float a[2][3];

C lays this out in memory by putting the item a[0][0] at the first memory location allocated for the array.  The next sequential location in memory is occupied by a[0][1], followed by a[0][2], a[1][0], etc.  If I declare and initialize a two-dimensional array to represent a matrix using the following code

    const float alpha = M_PI_4;
    float M[2][2] = { { cos(alpha), -sin(alpha) },
                      { sin(alpha),  cos(alpha) } };

the initialization of the matrix is done in the same order as one would write the matrix for a math paper:

cos(α) ‑sin(α)
sin(α) cos(α)


and, because using a row major array layout is more like the way mathematicians write matrices, I think this makes the matrices easier to use.

 I know, I know.  C doesn't really have multi-dimensional arrays.  It has arrays of arrays but I find the text so much easier to write talking about two-dimensional arrays rather than talking about an array of arrays.  Indulge me.

OpenGL, on the other hand, follows the FORTRAN convention of creating column major arrays.  The same array rewritten in column major form is

    float M[2][2] = { {  cos(alpha),  sin(alpha) },
                      { -sin(alpha),  cos(alpha) } };

which, in mathematical terms, is the transpose of the first representation.  That is, the rows of one matrix become the columns of the other matrix, and vice versa.  This has implications for how one uses the matrices to write mathematical expressions.  If I take two matrices, A and B, multiply them, and then take the transpose of the product, mathematicians write the expression this way:


(AB)ᵀ

and (without offering any sort of proof other than my simple, bold assertion) the transpose of the product of A and B is


(AB)ᵀ = BᵀAᵀ

that is, the transpose of the product of matrices A and B is the product of the transpose of B times the transpose of A.

Vectors can be thought of as either (or both) one-row matrices or one-column matrices.  If v is a one-row matrix then
(vM)ᵀ = Mᵀvᵀ

What this means is that if we multiply a vector times a matrix and take the transpose we get the same result as if we multiplied the transpose of the matrix by the transpose of the vector.

What does this mean when using my math tools with OpenGL?  It means that when I write C++ code which looks like:

    v = v * M;

or, alternatively,

    v *= M;

after v and M are passed to a shader program written in the OpenGL Shading Language (GLSL), the equivalent calculation of these two expressions needs to be rewritten in the shader program as

    v = M * v;

This was a very long prologue to make sure you understand that, when writing matrix equations in C++ using my tools, the order of multiplication is the reverse of the order of multiplication using GLSL.  The order of addition and subtraction isn't affected by this implementation detail; using either my tools or GLSL, the expression "matrix A minus matrix B" is written "A - B".  If you find this confusing you could download my tools and rewrite them so that the order of multiplication matches the order in GLSL.  To repeat, I implemented the tools as row major arrays because I wanted a math library that would be useful in other problem domains, and, IMHO, using row major matrices is more directly related to the conventions that are used to teach linear algebra.  That is, most people who use vectors and matrices already have their brains "prewired" to view matrices this way.

I don't know what design considerations the author had when he chose to create his vec2, vec3, vec4, mat3, and mat4 data types (defined in SDK/common/types.h) but these type names are the same names used for the corresponding data types in GLSL so it's likely that he borrowed the names from GLSL for consistency (as did I).  For the 2‑dimensional vector type the author's definition is:

    typedef struct
    {
        float x;
        float y;
    } vec2;

Originally, I declared my vec2 type using something like:

    class vec2 {
    private:
        float   _v[2];
    public:
        int nElems() const { return 2; }
        .
        .
        .
    };

My original vec3 class was similar:

    class vec3 {
    private:
        float   _v[3];
    public:
        int nElems() const { return 3; }
        .
        .
        .
    };

The reason I chose to use an array to hold the elements (rather than individual variables named x, y, z, and w) was to simplify the task of writing various class methods.  For example, my original implementation of the dotProduct() method for the vec2 class was:

    float vec2::dotProduct(const vec2 &rhs) {
        float   tmp = 0.0;
        for (int i=0; i!=nElems(); ++i)
            tmp += _v[i] * rhs._v[i];
        return tmp;
    }

Similarly for vec3:

    float vec3::dotProduct(const vec3 &rhs) {
        float   tmp = 0.0;
        for (int i=0; i!=nElems(); ++i)
            tmp += _v[i] * rhs._v[i];
        return tmp;
    }

Comparing the two dotProduct() methods above, you can see that the code inside the class methods was virtually identical; they differed only in the choice of data types for the operands.

I had similar definitions for the matrix types.  For example, the mat2 class was something like:

    class mat2 {
    private:
        vec2    _m[2];
    public:
        int nRows() const { return 2; }
        int nCols() const { return 2; }
        mat2 transpose() {
            mat2    tmp;
            for (int i=0; i!=nRows(); ++i) {
                for (int j=0; j!=nCols(); ++j)
                    tmp._m[j][i] = _m[i][j];
            }
            return tmp;
        }
        .
        .
        .
    };

Because each vector type had around 2 dozen class methods (not counting constructors, destructor, copy constructor, and assignment operator), it was all I could do to write 3 vector classes (vec2, vec3, and vec4), 3 matrix classes (mat2, mat3, and mat4), and the methods implementing their various interactions with each other, much less deal with non-square matrices (mat2x3, mat2x4, mat3x2, etc.) and the various interactions between the myriad data types.

Eventually, because all of the methods were virtually identical, I rewrote my vector and matrix types using C++ class templates.  The templates, in turn, can generate hundreds of different versions covering any dimensionality of vector and/or matrix I want or need.

In the end I came up with four class template types to implement the various vector and matrix types:
  • one template which is the base class for all vectors (vec),
  • a second vector template (which is a subclass of the original vector template) which adds the cross product method to 3‑tuples (cpvec),
  • one template for matrices which is the base class for all matrix classes (mat), and
  • a matrix template (which subclasses the first matrix template) for square matrices adding the division ("/=" and "/"), loadIdentity(), determinant(), adjoint(), and inverse() methods (matNxN).
The templates take parameters for the dimensionality and for the element data type (such as float or double).  Using templates I even defined the interactions between types (such as multiplying a vector by a matrix).
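
As a rough sketch (not the literal code from glml.h), the base vector template looks something like this, with a single dotProduct() now serving every dimensionality and element type:

    // Sketch of the base vector class template; N is the dimensionality
    // and T is the element type (float, double, ...).
    template <int N, typename T>
    class vec {
    protected:
        T   _v[N];
    public:
        int nElems() const { return N; }

        T dotProduct(const vec &rhs) const {
            T   tmp = T(0);
            for (int i = 0; i != nElems(); ++i)
                tmp += _v[i] * rhs._v[i];
            return tmp;
        }
        // ... the remaining couple dozen methods ...
    };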

There are also template functions for
  • vector, matrix, and quaternion output, and
  • various interpolation methods.
One thing I do like about the author's original vec2, vec3, and vec4 data types is that accessing the X-, Y-, Z-, and W-coordinates using his types reads better than it did using my old implementation.  For example, setting the X-coordinate of a vector variable v to zero using the author's data types can be expressed as:

    v.x = 0.0;

but, using my (original) vector classes based on arrays, setting the X-coordinate to zero looks like:

    v[0] = 0.0;

By overloading the "‑>" operator I was able to do (what I hope is) the next best thing.  The X‑component can be accessed as follows:

    v->x = 0.0;
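
In case you're wondering how an overloaded "‑>" operator can provide named access to array elements, the trick is roughly the following.  This is a sketch built on my own assumptions about the layout, not a quote of the actual glml code.

    // operator-> returns a pointer to a struct whose members overlay
    // the element array, giving named access without storing separate
    // x/y/z fields.  This assumes the struct and the array have
    // identical memory layouts.
    struct vec3names { float x, y, z; };

    class vec3 {
    private:
        float   _v[3];
    public:
        vec3names *operator->() {
            return reinterpret_cast<vec3names *>(_v);
        }
        const vec3names *operator->() const {
            return reinterpret_cast<const vec3names *>(_v);
        }
        // ...
    };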

My math tools include the definition of a template for a quaternion class, qTemplate.  For those of you who are unfamiliar with quaternions, and, in particular, how they're used in 3D graphics applications to rotate objects in space, I'll start with an explanation of what a quaternion is mathematically.

One can think of quaternions as 4‑dimensional vectors.  Normally, the first element represents the "real number" portion of the quaternion.  The remaining 3 components of a quaternion are usually referred to as the i, j, and k parts, respectively.

In math class we all learned that there is no real square root for negative numbers.  However, that didn't keep mathematicians from coming up with a way to represent the square root of negative numbers.  Even before a mathematician devised quaternions there was a portion of mathematics dealing with what mathematicians call complex numbers.  These numbers are "complex" in the sense that they have two parts: a real part, and an imaginary part.  Calling the second component of a complex number "imaginary" is an unfortunate choice of adjectives.  The, so-called, "imaginary numbers" are as real as the "real numbers"; they have their own set of mathematical rules which they follow consistently just like real numbers.  The "imaginary" part of a complex number is defined to be multiplied by i.  Mathematicians defined i to be the square root of -1 (negative one); that is, i multiplied by i, or i², is equal to -1; likewise, (-i)² is also equal to -1.  Complex numbers can be written the same way mathematicians write vectors; for example, the complex number with its real part equal to 3 and its imaginary part equal to 4 can be written as


(3, 4)

but this number can also be written as


3 + 4i

that is, the real number 3 plus 4 times the square root of -1.

Quaternions take complex numbers even farther.  In the quaternion number system i isn't the only square root of -1; there are also j, and k.  That is,


i² = j² = k² = -1

For real numbers (and even for complex numbers) multiplication is commutative, that is, if I multiply 3 times 5 I get the same result as if I multiplied 5 times 3.  In both cases, the product is 15.  The i, j, and k components of quaternions don't act that way.  Instead we have the following set of rules:
  • i × j = k,
  • j × k = i,
  • k × i = j,
  • j × i = -k,
  • k × j = -i, and
  • i × k = -j
This is where my math library diverges more significantly from the math library provided by the author.  While the author defines a quat data type in SDK/common/types.h, he doesn't use it anywhere in the code which accompanies the book. Instead he uses the data type vec4 when the code needs to use a quaternion type.  In terms of storage, both types are 4‑tuples (see above where I mentioned that quaternions can be thought of as 4‑dimensional vectors).  But normally, one thinks of vectors as representing a location or a direction; they can be added, subtracted, and multiplied by a scalar, but there isn't a "normal" multiplication operator where one can multiply a vector by a vector and get a vector as a result.  When one multiplies two vectors to calculate the dot product the value of the product is a scalar, not a vector like the two operands.  It's true that 3‑dimensional vectors (aka 3‑tuples) have a cross product operation for which the product of two vectors is also a vector, but, in general, vectors don't have a multiplication operation which can be used to multiply two vectors resulting in a product which is also a vector.  Quaternions, unlike ordinary 4‑dimensional vectors, have a multiplication operation which results in a quaternion product.

[Note:  "Normal" vector types (vec2/fvec2, vec3/fvec3, vec4/fvec4, dvec2, dvec3, and dvec4) use the aliases "x", "y", "z", and "w" for their first, second, third, and fourth elements, respectively; these aliases are used with the "‑>" operator.  For the quaternion class the real (as in "real number(s)"), i, j, and k components can be accessed with the "‑>" operator using the aliases "r", "i", "j", and "k", respectively.  IMHO this makes the code easier to read by making it easier for the reader to differentiate between ordinary vectors and quaternions.]

To incorporate my math library into the author's code I deleted the data type declarations vec2, vec3, vec4, mat3, mat4, and quat from the file SDK/common/types.h.  Then I
  • created a new subdirectory in the directory SDK/common named glml,
  • copied the files glml.cpp, and glml.h into this new directory,
  • in each .xcodeproj file for the sample programs I added (File → Add Files to …) the files SDK/common/glml/glml.cpp, and SDK/common/glml/glml.h to the projects, and
  • tested the changes by compiling, and running the programs using the new math library.
To get the programs to compile with my math library I needed to change the way that vector, and quaternion types were initialized.  Previously, the vec2, vec3, and vec4 data types had been initialized using the same syntax as one would use to initialize an array.  Any vec2, vec3, or vec4 data type which needs to be initialized now uses an appropriate constructor.  So, in the sample program chapter2‑2 the lines

    vec3    e = { 0.0f, -3.0f, 0.0f },
            c = { 0.0f,  0.0f, 0.0f },
            u = { 0.0f,  0.0f, 1.0f };

are now

    vec3    e(0.0f, -3.0f, 0.0f),
            c(0.0f,  0.0f, 0.0f),
            u(0.0f,  0.0f, 1.0f);

This also affects the initialization of vec2, vec3, and vec4 data members in the various modules.  For example, in the SDK/common/obj.h file the ambient data member of the OBJMATERIAL class was declared as:

    vec4    ambient = { 0, 0, 0, 0 };

and now its declaration is

    vec4    ambient;

This change, by itself, would leave ambient uninitialized when a new OBJMATERIAL instance is created (the default constructors for my vector and matrix types don't do anything).  To fix this I added "ambient(0,0,0,0), " to the initializer list for the OBJMATERIAL constructor definition.
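
As an illustration, the shape of the fix looks like this; the real OBJMATERIAL constructor takes other arguments and initializes other members as well.

    // Illustrative only: ambient now gets its zero value from the
    // constructor's initializer list instead of a brace initializer.
    OBJMATERIAL::OBJMATERIAL()
        : ambient(0.0f, 0.0f, 0.0f, 0.0f)
    {
        // ... other members are initialized here as well ...
    }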

The vector data types which I've added to the code base can be operated on by the normal arithmetic operators:  "+", "-", "*", and "/".  This means that much of the rest of the code can be rewritten to simplify the expression of the various vector calculations.

The original code had defined the functions vec3_add() and vec3_diff() to perform addition and subtraction on vec3 objects.  The same can now be done directly using the "+" and "-" operators.  For example, in the original OBJ_load() function there were the following lines of code:

    vec3_add(&this->indexed_normal[objtriangleindex->vertex_index[0]],
             &this->indexed_normal[objtriangleindex->vertex_index[0]],
             &normal);

    vec3_add(&this->indexed_normal[objtriangleindex->vertex_index[1]],
             &this->indexed_normal[objtriangleindex->vertex_index[1]],
             &normal);

    vec3_add(&this->indexed_normal[objtriangleindex->vertex_index[2]],
             &this->indexed_normal[objtriangleindex->vertex_index[2]],
             &normal);

Using the new definition of the vec3 class these lines of code in the OBJ::load() method have become

    this->indexed_normal[objtriangleindex->vertex_index[0]] += normal;
    this->indexed_normal[objtriangleindex->vertex_index[1]] += normal;
    this->indexed_normal[objtriangleindex->vertex_index[2]] += normal;

Calls to mat4_multiply_mat4() can be replaced with simple equations so that

    mat4_multiply_mat4(&projector_matrix,
                       &projector_matrix,
                       &GFX_get_modelview_projection_matrix());

becomes

    projector_matrix =
        GFX_get_modelview_projection_matrix() * projector_matrix;

[Note:  This is one of those cases where, because my linear algebra tools use row major matrices, the order of the matrix multiplication changes.]

Similarly, calls to mat4_copy_mat4() such as

    mat4_copy_mat4(&projector_matrix_copy, &projector_matrix);

can be replaced with the code

    projector_matrix_copy = projector_matrix;

And there are even more complicated statements which can be simplified.  The statements

    tangent.x = (v1.x * uv2.y + v2.x * u1.y) * c;
    tangent.y = (v1.y * uv2.y + v2.y * u1.y) * c;
    tangent.z = (v1.z * uv2.y + v2.z * u1.y) * c;

can now be rewritten as the single line of code

    tangent = (v1 * uv2->y + v2 * u1->y) * c;

The original multi-line version of the code is a common example of code which I mess up.  I start by typing in the first line and then duplicate it twice.  Then I have to change each occurrence of x in the second line to y, and, likewise, in the third copy of the line I need to change the occurrences of x to z.  Somewhere in the process I lose track of where I am, make a mistake or two along the way, and still end up with syntactically valid C/C++ code which compiles but calculates the wrong thing.

To illustrate how common this kind of error in repetitive code is, I recently found such a bug in the author's original implementation of the box_in_frustum() function in the file SDK/common/utils.cpp.  There are eight ways to add and subtract the X-, Y-, and Z‑components of the vec3 variables location and dimension.  One of the combinations was used twice, which caused one of the required combinations to be omitted, resulting in a bug.  The bug is now fixed in the copy of the code at GitHub.  The point isn't that the author made a mistake.  The point is that I've made this kind of mistake myself on many occasions, and I think that having vector data types which the programmer can manipulate with the normal arithmetic operators makes it easier to avoid these kinds of errors when performing repetitive tasks.

While I'm talking about the SDK/common/utils.cpp file, please look at the build_frustum() function.  I replaced the original code to calculate the product of two matrices with a set of nested loops where the original version of the code calculated the product using only sequential statements.  While this is a relatively noticeable difference there is something more significant in the way I wrote the loops.  The variable c no longer holds the product of the modelview and projection matrices; it now holds the transpose of their product.  The old code used the columns of the old version of the c variable to calculate the vectors which describe the six planes which bound the viewing frustum.  By changing the calculation of c so that it holds the transpose of the product, I can use the rows of c to calculate the six bounding planes.  Being able to use the rows is convenient because each row is an instance of the vec class template type, and, therefore, inherits all of vec's arithmetic operators allowing me to simplify the code for calculating the frustum bounding planes.  I hope you'll find that the new version of the code is much more elegant, readable, and maintainable.

The vector class types have constructors which allow one vector type to be converted into another.  I was careful to declare the constructors which can be used for vector type conversion as explicit; I don't want the compiler freely converting one vector type into another without my knowledge because it could mask errors.  I've used these type converting constructors to simplify the code.  For example, in the Chapter 10 sample programs, and in the LIGHT module there were these two lines of code:

    memcpy(&this->position, &position, sizeof(vec3));
    this->position.w = 1.0f;

Using one of the type conversion constructors this code is now:

    this->position = vec4(position, 1.0f);

Finally, I've used these constructors to replace calls to vec3_multiply_mat4().  For example, again in the Chapter 10 sample programs and in the LIGHT module there were the lines:

    vec3_multiply_mat4(direction,
                       &this->direction,
                       m);

This code was rewritten as

    direction = vec3(vec4(this->direction, 0.0f) * m);

There is an alternate way to solve this problem.  I could have written my own operator to multiply a vec3 variable by a mat4 variable and return a vec3 result such as:

    const vec3 operator*(const vec3 &lhs, const mat4 &rhs) {
        vec3    tmp;
        for (int i=0; i != tmp.nElems(); ++i) {
            tmp[i] = lhs[0] * rhs[0][i];
            for (int j=1; j != lhs.nElems(); ++j)
                tmp[i] += lhs[j] * rhs[j][i];
        }
        return tmp;
    }

Then the code could simply have been written as

    direction = this->direction * m;

I may write such an operator someday but I have avoided it because it does two implicit data type conversions — converting the vec3 lhs operand into a vec4 by adding a fourth component of zero, and then converting the vec4 product into a vec3 by truncating the fourth component — and I'm still feeling cautious about allowing implicit type conversions.

There's another reason to avoid using such a specialized operator.  The vec3_multiply_mat4() function is the source of a bug in the sample programs chapter4‑5, chapter5‑3, chapter5‑4, and chapter5‑5.  Each of these programs has code like the following:

    else if (!strcmp(program->uniform_array[i].name, "LIGHTPOSITION")) {
        vec3 position    = { 0.0f, -3.0, 10.0f };
        vec3 eyeposition = { 0.0f,  0.0f, 0.0f };

        vec3_multiply_mat4(&eyeposition,
                           &position,
                           &gfx.modelview_matrix[gfx.modelview_matrix_index-1]);

        glUniform3fv(program->uniform_array[i].location,
                     1,
                     (float *)&eyeposition);
    }

The bug here is that variable position isn't treated like a position at all; it's treated like a direction because the function vec3_multiply_mat4() treats position's fourth component as a zero rather than as a one.  (Remember that we use 4‑dimensional vectors/matrices because our geometry calculations are done in homogeneous space.)  When vec3_multiply_mat4() multiplies position by the model view matrix, any translations which have been accumulated in the model view matrix will be lost.  To fix the problem I used the following code:

    vec4 position   (0.0f, -3.0f, 10.0f, 1.0f);
    vec3 eyeposition(0.0f,  0.0f,  0.0f);

    eyeposition = vec3(position *
                       gfx.modelview_matrix[gfx.modelview_matrix_index-1]);

The picture in the Kindle edition of the book (Figure 4‑6) now matches what I see on my screen when I use this fix.  It would appear that the picture in the book is out of sync with the sample code.

This bug illustrates why an operator to multiply a vec3 variable by a mat4 variable is a bad idea.  In some cases the vec3 variable needs to be treated as a direction (using a fourth component which is zero), and in other cases it needs to be treated as a location (with a fourth component which is non-zero).  If I had defined such a multiplication operator there is a good chance that at least part of the time the behavior of the multiplication operator would treat the vec3 variable incorrectly; there is no obvious correct behavior to cover all cases, just like when using the old vec3_multiply_mat4() function.

[Note:  Some of you may have done a quick code analysis and determined (correctly) that my version of the code is less efficient than the original code.  I will address this point in a later post.]

Please note I haven't replaced/removed all of the original matrix code.  The reason for this goes back to what I said earlier about leaving things out of my linear algebra tools that are specific to a particular problem domain.  The author's matrix code includes functions to perform 3D translations, 3D rotations, and 3D scaling using 4D matrices.  Additionally, there are matrix operations to set up where the OpenGL camera/eye is located, and to define the viewing frustum.  These are not generic matrix operations.  They are specific to a particular problem domain (3D graphics) so I have not implemented them as part of my general purpose linear algebra tools.  If I had implemented these tools for my 4×4 matrix type these tools would have been conspicuously absent from the tools for the 2×2 and 3×3 matrix data classes, and, in my opinion, this would create an awkward lack of uniformity across the square matrix types.  The author included these tools as part of his math library (I presume) because he was specifically solving 3D graphics problems.  The author's approach isn't wrong but it's different from my own.  I will introduce my own tools for supporting these calculations in a future post.  Until then the code will continue to use the author's tools for these operations.

The next change is more invasive.  In the definition of the MD5JOINT class in the file SDK/common/md5.h I changed the type of the data member rotation from vec4 to quaternion.  I had mentioned in a previous post that (IMHO) this data member was declared to have the wrong type.  The author had declared a quat data type in SDK/common/types.h but had never used it.  Instead, for any data member which looked like a four element data type he simply used the type vec4.  I don't know why the code was written that way.  There are several clues that, in some cases, when the original code uses the vec4 type the data being operated on is really a quaternion.  The first I noticed was the function vec4_conjugate().  One doesn't normally think of an ordinary 4‑dimensional vector as having a conjugate, but quaternions do have a conjugate.  The next function which caused me to suspect that some variables declared to be type vec4 were really quaternions was the function vec3_rotate_vec4(), because I know that quaternions are often used to rotate objects in 3‑dimensional space.  vec3_rotate_vec4() calls vec4_conjugate(), vec4_multiply_vec3(), and vec4_multiply_vec4().  When I examined the vec4_multiply_vec3() function I could see that the vec3 data passed in was being promoted (that is, undergoing a type transformation) to a quaternion, and that the operation being carried out by the function was the first step in using a quaternion to rotate an object in 3‑dimensional space.  The function vec4_multiply_vec4() was finishing the final step in using a quaternion to rotate an object in 3‑dimensional space.  The last step in vec3_rotate_vec4() uses memcpy() to extract the 3D vector representing the x, y, and z components of the vec4 variable and place them in a vec3 variable.  All of these things led me to the conclusion that the rotation data member of the MD5JOINT class is really a quaternion.

Which brings me back to explaining why I used the values "{ 0, 0, 0, 1 }", or, using my quaternion implementation, "(1, 0, 0, 0)", to initialize the rotation data member.  Rotations in 3D space are defined as an axis of rotation (which is a 3D vector), and an angle of rotation.  If the code is using an angle of rotation of θ, then let α be equal to one half θ.  Viewed another way:

2α = θ

Now take a look at the vector used as the axis of rotation.  Assume the length of the vector is 1 (one), because if the length of the vector is any non-zero value, it's easy to divide the vector by its length and make it have a length of 1.  Alternatively, if the length of the vector is zero, it's not a valid rotation vector (unless θ is also zero, which we'll come to shortly).  If I name the vector used as the rotation vector v, the quaternion which represents rotation through the angle θ about the vector v is

cos(θ/2) + v_x sin(θ/2) i + v_y sin(θ/2) j + v_z sin(θ/2) k

or

cos(α) + v_x sin(α) i + v_y sin(α) j + v_z sin(α) k

This quaternion has a length of 1 (one); all quaternions representing rotations have length 1.  If θ is zero, then the quaternion doesn't do any rotation.  And if θ is zero, then α (which is half of θ) is half of zero, or just zero.  The cosine of zero is 1, and the sine of zero is zero so the quaternion representing a rotation through zero degrees (or zero radians since they are equivalent) is
1 + 0i + 0j + 0k

which is why I picked the initialization values of "{ 0, 0, 0, 1 }" and/or "(1, 0, 0, 0)" to initialize quaternion variables in the various constructors.  This quaternion value represents "do no rotation"; that is, it represents an identity value.  One last bit of important quaternion knowledge as it applies to 3D graphics: the quaternion ‑q represents the same rotation as the quaternion q, so


-1 + 0i + 0j + 0k

also represents "do no rotation".

My custom when writing a quaternion places the real component first, followed by the i, j, and k components.  But, as they say, the good thing about standards is that there are so many to choose from.  The author has chosen to use a different, but equally valid, way of writing quaternions.  He represents the real portion of the quaternion with the w component of the vec4 data type.  He has assigned the i, j, and k portions to the x, y, and z components, respectively.  This usage is entirely consistent with the way quaternions are used in 3D graphics applications to perform rotations.  This is one of those things where the choices aren't right or wrong, but personal preference.  Where the author has defined the real portion of the quaternion as the last element of a 4‑tuple, I have defined it as the first, and I have defined the i, j, and k components to be members of a 3‑tuple embedded within the quaternion.

Using the author's method, if the program defined a variable q as type vec4 (or quat), the code would access the real part using q.w, the i part using q.x, the j part using q.y, and the k part using q.z.

Using my definition for quaternions the program would access the real part using the alias q‑>r, the i part using q‑>i, the j part using q‑>j, and the k part using q‑>k.  Also, because my quaternion implementation makes the real part and the imaginary vector public, the code could, alternatively, access the portions of the quaternion using q.w, q.v‑>x, q.v‑>y, and q.v‑>z; this second method of accessing the components is more like the author's method.  I've given the programmer a choice of naming conventions.  Pick your poison!

Another change you'll see in this version of the code is that more of the function parameter lists have been changed to pass in references to variables rather than pointers.  I discussed some of the benefits of this in my post about rewriting the MD5 module.  Now that I've overloaded the "‑>" operator to provide named access to the components of the vec2, vec3, vec4, and quaternion data types this has an additional benefit.  For example, if you look at the function quat_build_r() in SDK/common/vector.cpp (formerly vec4_build_w()) you'll see that the quaternion parameter is passed in by reference.  Using the old vec4_build_w() method of passing in a pointer to the quaternion the code might look something like

    float l = 1.0f - ((*q)->i * (*q)->i) -
                     ((*q)->j * (*q)->j) -
                     ((*q)->k * (*q)->k);

Now that the quaternion objects are passed in by reference I think it makes the code easier to read:

    float l = 1.0f - (q->i * q->i) -
                     (q->j * q->j) -
                     (q->k * q->k);

As I mentioned earlier I tried to limit the implementation of my linear algebra tools to functions which are of general use.  However, I've added functions to both the vec and mat templates which may be a violation of that rule.  In order to pass uniform variable values into shader programs OpenGL has a number of glUniform() functions.  For example, the sample program chapter8‑5 uses the following code to pass color information to the GLSL shader program:

    glUniform3fv(path_point->get_uniform_location((char *)"COLOR"),
                 1,
                 (float *)color);

In this case color is declared to be type "vec3 *"; glUniform3fv() requires that its third parameter be a pointer to float, hence the need to cast color's type.  Since OpenGL routinely requires vector information to be passed into its routines as "float *", I added the method v() to the vec class template to simplify this process.  The above call to glUniform3fv() can be rewritten

    glUniform3fv(path_point->get_uniform_location((char *)"COLOR"),
                 1,
                 color->v());

The matrix types have a similar class method but its name is m() and it serves the same purpose.  For example, in the same sample program, chapter8‑5, the modelview-projection matrix can be passed to the shader program using:

    glUniformMatrix4fv(path_point->get_uniform_location((char *)"MODELVIEWPROJECTIONMATRIX"),
                       1,
                       GFX_get_modelview_projection_matrix().m());

The MD5 module uses three different types of interpolation to support animation.  The simplest type is linear interpolation.  The MD5 module only ever does linear interpolation on vec3 data but linear interpolation is a very common task.  Consequently, my approach isn't to write an individual linear interpolation routine to support a specific data type.  Instead, I wrote a function template named linterp() which can be used with a wide variety of data types, including scalars, other vector types, and quaternions.  My linterp() function template replaces calls to the author's vec3_lerp() function.  By having a single function template the code benefits from function overloading; the programmer only has to remember a single function name when a linear interpolation is needed, regardless of the data type being interpolated.
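
A linear interpolation template can be as small as the sketch below.  Anything supporting "+", "-", and multiplication by a scalar works, which is what lets one definition cover scalars, vectors, and quaternions.  The actual signature in my code may differ.

    // Linear interpolation between 'from' and 'to'; t = 0 yields
    // 'from', and t = 1 yields 'to'.
    template <typename T>
    T linterp(const T &from, const T &to, float t) {
        return from + (to - from) * t;
    }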

The second interpolation type is called Spherical Linear intERPolation, and is commonly referred to as SLERP.  Ordinary linear interpolation calculates a point on the line segment between two points in an arbitrary N‑dimensional space.  SLERP treats two points as being on the surface of a unit sphere (i.e., a sphere which has a radius of one), and calculates a point on the surface of the sphere along the great circle which passes through both end points.  This is useful for interpolating rotations/orientations which are expressed as quaternions.  You'll remember that earlier in this post I stated that all quaternions which represent rotations have a length of 1.  This means that all rotation quaternions lie on the surface of a 4‑dimensional unit sphere.  If the program used linear interpolation to calculate an intermediate rotation/orientation between two rotations/orientations, the intermediate point would tunnel below the surface of the unit sphere, and, having done so, would no longer be a distance of one from the center of the sphere, and, therefore, would not represent a valid rotation quaternion.  SLERP calculates only points which are on the surface of the sphere, assuming, of course, that the two end points for the interpolation are themselves on the surface of the unit sphere.  There is an additional wrinkle.  If two quaternions are on opposite sides of the unit sphere, anomalies show up in the interpolation.  These anomalies can be averted by exploiting the fact that the quaternions q and ‑q represent the same rotation.  The code determines if two quaternions are on opposite sides of the sphere by treating the quaternions as 4‑dimensional vectors and computing their dot product.  If the dot product is negative the two quaternions are on opposite sides of the sphere, so the slerp() function template negates the second quaternion and uses the negated value in the actual SLERP calculation.  More details of the mathematics behind spherical linear interpolation can be found in the books listed in the bibliography below.  The slerp() function template replaces calls to the vec4_slerp() function.  Before moving on I should mention that I implemented SLERP as a function template rather than an ordinary function because, in other graphics programs, there are times when other vector quantities need to be interpolated using this method, so a function template saves me from maintaining multiple versions of the function.
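
Below is a sketch of SLERP with the opposite-hemisphere guard just described.  The dotProduct() helper and the arithmetic operators are assumed to be provided by the vector/quaternion templates; the real slerp() may be organized differently.

    #include <cmath>

    // Sketch of spherical linear interpolation between two unit
    // quaternions (or unit vectors).
    template <typename T>
    T slerp(const T &from, T to, float t) {
        float cosOmega = dotProduct(from, to); // treat both as 4-D vectors
        if (cosOmega < 0.0f) {                 // opposite sides of the sphere:
            to = -to;                          // -q is the same rotation as q
            cosOmega = -cosOmega;
        }
        float omega = acosf(cosOmega);         // angle between the end points
        float s     = sinf(omega);
        if (s < 1.0e-6f)                       // nearly identical rotations:
            return linterp(from, to, t);       // fall back to linterp()
        // Interpolate along the great circle through the end points.
        return from * (sinf((1.0f - t) * omega) / s) +
               to * (sinf(t * omega) / s);
    }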

The last interpolation type found in the code is a cheat.  [Note:  Calling the code a cheat here isn't a criticism.  Everything in 3D graphics is a cheat.  The criterion for whether it's a good cheat or a bad cheat is whether, or not, the calculation gives acceptable results.]  It purports to perform linear interpolation on vec4 quantities but the function vec4_lerp() is really a poor man's SLERP routine.  Before explaining how the code cheats I should point out that, IMHO, the function is misnamed because the values it's operating on are really quaternions.  The function has the same safeguard against interpolating between quaternions on opposite sides of the unit sphere as described above in the discussion of SLERP.  But afterwards, the routine uses true linear interpolation to calculate an intermediate point, which, as also described above, means that the function can generate intermediate points which are below the surface of the unit sphere.  As "cheats" go, its value is that it avoids expensive calculations of transcendental functions (sine and arctangent); my real criticism of the code as an optimization is that, IMHO, after the code calculates the intermediate point, it should normalize the length of the vector so that it's a proper rotation quaternion.  The author makes up for this in other places in the code, but it's my opinion the issue should really be dealt with in the routine.  By compensating for the length of the interpolated value in other places in the code, the code unnecessarily normalizes quaternions which were generated by the SLERP routine.  To replace this portion of the author's code I created a new template function which I call rinterp(), meaning it's a linear interpolation for rotations.  I don't know that this function template will ever be needed for anything other than quaternions, so I wrote it to only support types created from my class template qTemplate.  If you find it's of value for dealing with other vector types feel free to re-write it to suit your purposes.
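
For what it's worth, here is a sketch of what rinterp() can look like if it also applies the normalization I just argued for.  The dotProduct() and length() helper names are assumptions, not necessarily the names in my code.

    // Sketch: linear interpolation with the same hemisphere guard as
    // slerp(), followed by a renormalization so the result lands back
    // on the unit sphere (i.e., is a proper rotation quaternion).
    template <typename T>
    T rinterp(const T &from, T to, float t) {
        if (dotProduct(from, to) < 0.0f)   // opposite sides of the sphere
            to = -to;
        T tmp = linterp(from, to, t);
        return tmp / tmp.length();
    }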

As mentioned previously, I've changed many of the GFX module routines to use parameters passed by reference rather than by pointer.  There is one other set of changes to the GFX module which I've started in this iteration of the code (and will continue in a future post) which needs to be introduced.  The original versions of the functions GFX_translate(), and GFX_scale() each take three scalar arguments, which the functions use to construct a vec3 variable.  The vec3 variable is then passed as an argument to mat4_translate(), or mat4_scale() to do the actual work of GFX_translate(), or GFX_scale() respectively.  The irony of this implementation is that the application code which calls either GFX_translate(), or GFX_scale() often has a vec3 variable which it breaks up and passes to the GFX_translate()/GFX_scale() functions as individual scalars.  For example, in the original version of the sample program chapter4‑1 the code makes the call

    GFX_translate(objmesh->location.x,
                  objmesh->location.y,
                  objmesh->location.z);

Instead of breaking the vec3 variable into scalars it would be more convenient to pass the objmesh‑>location variable whole.  To do this I've added new versions of GFX_translate() and GFX_scale() which take a vec3 variable as an argument.  Using this new version of GFX_translate() the call in the program chapter4‑1 becomes:

    GFX_translate(objmesh->location);


The old interfaces to GFX_translate() and GFX_scale() still exist, but they now call the new versions of GFX_translate() and GFX_scale() which take a single vec3 parameter.
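
The forwarding itself is trivial; conceptually the legacy interface is now just the following sketch.

    // The legacy three-scalar interface builds a vec3 and hands it to
    // the new vec3 overload.
    void GFX_translate(float x, float y, float z) {
        GFX_translate(vec3(x, y, z));
    }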

For more information about spherical linear interpolation (SLERP) see the following books:

This wraps up the introduction of my linear algebra and quaternion tools.  Again, I hope that you'll agree that they make the code easier to write correctly by simplifying the programming of various math equations, and that, by making the equations simpler, they also make the code easier to read.

In the next post I'll introduce tools to replace the rest of the author's matrix tools, and, finally, begin a serious attempt to upgrade the author's GFX module to make it more efficient.