Wednesday, December 11, 2013

Remaking MD5 into a Class

This post continues my series about rewriting the code from Game and Graphics Programming for iOS and Android with OpenGL ES 2.0 to better leverage the abilities of C++. The MD5 module is used to perform animation of a figure.  The header file defines the data types
  • MD5JOINT,
  • MD5VERTEX,
  • MD5TRIANGLE,
  • MD5WEIGHT,
  • MD5MESH,
  • MD5ACTION, and
  • MD5
To get a copy of the code as described in this post pull a copy from GitHub using the tag chapter12.

The MD5 type uses the (by now overly) familiar pattern of an integer and a pointer; the pointer points to dynamically allocated space for an array of objects and the integer is the count of the number of objects in that array.  Similar to previous cases
  • n_joint and bind_pose are replaced by std::vector<MD5JOINT> bind_pose,
  • n_mesh and md5mesh are replaced by std::vector<MD5MESH> md5mesh, and
  • n_action and md5action are replaced by std::vector<MD5ACTION> md5action.
In the MD5ACTION type
  • n_frame and frame are replaced by std::vector<std::vector<MD5JOINT> > frame, and
  • pose is replaced by std::vector<MD5JOINT> pose.
There are two things which are different about MD5ACTION than what I've presented before.  The first is that the old pose data member wasn't directly paired with an integer.  In its case it reuses the integer n_joint from the MD5 class which in my modified version of the code becomes MD5::bind_pose.size().  The second difference is that frame is an array of arrays.  Until I start using C++ 11, note that a space character is required between the two ">" characters in the declaration.  Prior to C++ 11 the space was required to disambiguate between adjacent right angle brackets used for templates and the right shift operator (>>).

In the MD5MESH type
  • n_vertex and md5vertex are replaced by std::vector<MD5VERTEX> md5vertex,
  • n_triangle and md5triangle are replaced by std::vector<MD5TRIANGLE> md5triangle, and
  • n_weight and md5weight are replaced by std::vector<MD5WEIGHT> md5weight.
The old MD5_load_mesh() function is the basis for the MD5 class constructor.  In a break with past posts MD5_free() isn't used at all to construct the MD5 class destructor.  The old MD5_free() function is:

    MD5 *MD5_free( MD5 *md5 )
    {
        unsigned int i = 0,
                     j;

        MD5_free_mesh_data( md5 );

        while( i != md5->n_mesh )
        {
            MD5MESH *md5mesh = &md5->md5mesh[ i ];
                
            if( md5mesh->md5vertex ) free( md5mesh->md5vertex );
                
            if( md5mesh->vertex_data )
                free( md5mesh->vertex_data );
                
            if( md5mesh->md5triangle )
                free( md5mesh->md5triangle );
                
            if( md5mesh->md5weight )
                free( md5mesh->md5weight );

            if( md5mesh->vbo )
                glDeleteBuffers( 1, &md5mesh->vbo );

            if( md5mesh->vbo_indice )
                glDeleteBuffers( 1, &md5mesh->vbo_indice );
                
            if( md5mesh->vao )
                glDeleteVertexArraysOES( 1, &md5mesh->vao );
                
            ++i;
        }

        if( md5->md5mesh ) free( md5->md5mesh );
        

        i = 0;
        while( i != md5->n_action )
        {
            MD5ACTION *md5action = &md5->md5action[ i ];
                
            j = 0;
            while( j != md5action->n_frame )
            {
                free( md5action->frame[ j ] );
                ++j;
            }
                
            if( md5action->frame ) free( md5action->frame );
                
            if( md5action->pose ) free( md5action->pose );

            ++i;
        }
        
        if( md5->md5action ) free( md5->md5action );

        
        if( md5->bind_pose )
        {
            free( md5->bind_pose );
            md5->bind_pose = NULL;
        }
        

        free( md5 );
        return NULL;
    }

I gutted the whole thing.  If you look in the file SDK/common/md5.h you'll see that the entire definition of the MD5 destructor is:

    ~MD5() {}

If you find it unsettling for the MD5 class, which has data members that dynamically allocate memory to have a destructor which, seemingly, does nothing, please know that I find it a little unsettling too.  If it makes you feel better you could always rewrite the destructor as:

    ~MD5() {
        bind_pose.clear();  // Analogous to free(bind_pose);

        md5action.clear();  // Analogous to free(md5action);

        md5mesh.clear();    // Analogous to free(md5mesh);
    }


You can see that above in the listing for the old MD5_free() function I've color coded various parts.  The red text, or what is left of it in my version of the code, now appears in the destructor for the MD5MESH class.  The green text has been removed from the code, entirely. 

The cosmetic difference between the two versions of the MD5 class destructor is that in the second version the programmer explicitly states which MD5 class resources need to be deconstructed before the code deconstructs the MD5 class instance, itself.  In the first (empty) version of the MD5 class destructor the compiler automatically generates code to release the space dynamically allocated to these vectors.  In both cases, the destructor method for the std::vector template includes logic to iterate over each of the elements in the std::vector instances and invoke that object's class destructor on that element.  (This is why the red text has been removed from the MD5 destructor, and placed into the destructor for the MD5MESH class.)

Take a second look at the red text above.  The code to release the space for the md5vertex, md5triangle, and md5weight data members is, likewise, missing, and unneeded, in the implementation of the MD5MESH destructor for the same reason.  Because these three data members are defined as std::vectors the compiler automatically generates code which iterates over each of the vectors' elements, and invokes the appropriate class destructor to clean up the element just prior to releasing that element's space back to the heap.  The only red text code which survives in the MD5MESH destructor is the code which the compiler won't generate automatically.  The vertex_data data member of the MD5MESH class code had its space allocated "manually" (using malloc(), calloc(), and/or realloc()) so the destructor must "manually" release the space (by calling the free() function).  Likewise, the MD5MESH data members vbo, vbo_indice, and vao hold data resources which are created by OpenGL and the destructor has to use OpenGL to release those resources so OpenGL can reuse them.

I don't (yet) need to write destructors for either the MD5ACTION, or MD5JOINT classes; the compiler is able to successfully generate their destructors automatically so I rely on the compiler to do that work for me.  Though, if like having an empty MD5 class destructor, this makes you feel unsettled, by all means, feel free to add explicit definitions, and implementations of these two destructors.

I need to back up here to explicitly state the reason why I think that the MD5 destructor is better than the old MD5_free() function.  That the old MD5_free() function had the knowledge to free the resources of the old MD5MESH, and MD5ACTION data types violates good encapsulation procedures.  The implementation details of those data structures should be opaque to the code which uses them; those classes (such as MD5MESH) should only expose what is needed for other code to be able to use them.  It is appropriate that since the MD5 object owns MD5MESH, MD5ACTION, and MD5JOINT objects that it be responsible for making sure that they are destructed.  However, the appropriate way for the MD5 object to fulfill this obligation is to send a message to each of the objects it owns telling them that it is time for them to destruct themselves not for the MD5 object to perform that task itself.  For example, if the old code was some thing like

    for (int i=0; i!=md5->n_mesh; ++i)
        MD5MESH_free(&md5->md5mesh[i]);

    if (md5->md5mesh) free(md5->md5mesh);

    for (int i=0; i!=md5->n_action; ++i)
        MD5ACTION_free(&md5->md5action[i]);

    if (md5->md5action) free(md5-> md5action);

then I think this code is consistent with good data encapsulation practices.  If the definition of the MD5MESH data type changes, those changes should (IMHO) largely be opaque to classes which use MD5MESH class objects.  Similarly for the code which exposes details about the MD5ACTION data structure to the MD5_free() function.  Note that no details of the MD5JOINT data structure are exposed in the old MD5_free() function; the function simply frees the resources which the MD5 "class" code allocated.

Take a look at the copy constructor and the assignment operator for the MD5MESH class.  You'll see that I've opted to do a "deep" copy of the data.  That is, I don't simply copy the contents of the pointer vertex_data in the source object into the destination object.  I allocate new space for the destination object to point to, and then copy the contents of the dynamically allocated space pointed to by the object being copied into the newly allocated space pointed to by the destination object.  In the assignment operator before doing any of this I first test to see if the destination object already points at dynamically allocated space, and free the old space, if needed.  (Look at the code; it's much easier to do than this paragraph might lead one to believe.  )

Another thing you may have noticed about the changes in the header is that I added initializer lists to the data members which have array types.  Likewise, I went back through the various modules that I had previously rewritten and added initializer lists to those classes too.  One reason for doing this is that I can't initialize those data members in the initializer lists for the constructors; to initialize the array data members in the initializer list which is part of the constructors the array data members would have to be classes and then I could use the constructors for those classes to initialize the data members in the initializer lists of the constructors.  In most cases I've made all of the values for the initializer lists for the array data members to be all zeroes. (The offset data member of the OBJMESH class is one exception.)  You'll note that for the rotation data member of the MD5JOINT class I used the initializer list "{ 0, 0, 0, 1 }".  If the data field were really a vec4 instance I would have used all zeroes.  But (IMHO) it's not really a vec4, at all.  The details of why I think the data member rotation is incorrectly declared to be a vec4 data structure will be covered in a future post.

Back to the definition of the MD5 class.  I didn't like the signatures of the of the old MD5_set_pose(), MD5_blend_pose(), and MD5_add_pose() functions.  For example, when the old MD5_set_pose() function is called a pointer to an MD5JOINT object is passed in.  What does this pointer represent?
  • Does it mean that the programmer decided to pass in the address of the object because if he passed the object by value the compiler would have generated code to make a copy of the object and that temporary object would have been passed on the stack to the function, and the programmer considered that inefficient?
  • Did the programmer pass the address of the object so that the function could modify the contents of the object?
  • Did the programmer pass the address of the object as a C idiom for passing an entire array of such objects for the function to access?
  • If the programmer is passing in an array does he also expect that the function will modify the contents of that array?
Without examining the body of the MD5_set_pose() function it isn't immediately obvious what the programmer intended.  The same questions apply to the old MD5_blend_pose(), and MD5_add_pose() functions' parameter lists.  Consequently, I modified the signatures of the MD5::set_pose(), MD5::blend_pose(), and MD5::add_pose() classes methods to clarify the roles of the various parameters:

    void set_pose(std::vector<MD5JOINT> &pose);
    void blend_pose(std::vector<MD5JOINT> &final_pose,
                    const std::vector<MD5JOINT> &pose0,
                    const std::vector<MD5JOINT> &pose1,
                    const  MD5Method joint_interpolation_method,
                    const float blend);
    void add_pose(std::vector<MD5JOINT> &final_pose,
                  const MD5ACTION &action0,
                  const MD5ACTION &action1,
                  const MD5Method joint_interpolation_method,
                  const float action_weight);

Now it should be obvious that the pose argument passed into MD5::set_pose() can be (and, in fact, is) modified by the MD5::set_pose() class method, and it's also clear that pose is a collection of objects (in this case an instance of std::vector) being passed into the method.  Similarly, it's clear that only the final_pose argument to MD5::blend_pose(), and MD5::add_pose() can possibly be an output of these class methods; all of the other arguments to these two class methods are only inputs (their declarations include the const qualifier), and it's clear that the arguments final_pose, pose0, pose1, action0, and action1 are collections of objects, not simply the address of a single object being passed into the class methods to avoid passing a temporary copy of the argument on the stack.

However, declaring that pose0, pose1, action0, and action1 are read-only (const) causes a ripple effect.  Various data members from these object instances are passed to the functions vec3_lerp(), vec4_lerp(), and vec4_slerp().  The signatures for the original versions of these functions doesn't specify that the second, third, and fourth arguments are read-only (const), though, in fact, they are.  I needed to modify the declaration, and implementation of these functions in the files SDK/common/vector.h, and SDK/common/vector.cpp, respectively.  Likewise, because vec4_lerp(), and vec4_slerp() invoke vec4_dot_vec4(), the signature for this function needs to be changed, as well.  One might guess that other functions in these two files need the same treatment.  They do, but I'm not going to make any more changes to the code in these two files for the purposes of this post because those changes would only be lost in the revisions I already have planned for these two files.

You may have noticed that in the new signatures for the MD5::blend_pose(), and MD5::add_pose() class methods that one of the parameters has the type MD5Method.  The old version of the header file declared

    enum
    {
        MD5_METHOD_FRAME = 0,
        MD5_METHOD_LERP  = 1,
        MD5_METHOD_SLERP = 2
    };

which is an anonymous enumeration type.  By giving the enumeration type a name, and then using that name to declare the type of the joint_interpolation_method parameter it greatly (IMHO) improves the readability of the output from the Xcode debugger.  When the values of variables of type MD5Method are displayed by the debugger, the debugger will show the symbolic values MD5_METHOD_FRAMEMD5_METHOD_LERP, and MD5_METHOD_SLERP, rather than the values 0, 1, and 2, respectively.  The anonymous enumeration which defines STOPPLAY, and PAUSE in the file SDK/common/types.h needs to have a name for the same reason.

Creating a default constructor for the MD5MESH class allowed me to simplify the following code from the old MD5_load_mesh() function:

    unsigned int i = 0;

    md5->md5mesh =
        ( MD5MESH * ) calloc( md5->n_mesh, sizeof( MD5MESH ) );

    while( i != md5->n_mesh )
    {
        md5->md5mesh[ i ].mode = GL_TRIANGLES;
        md5->md5mesh[ i ].visible = 1;
        ++i;
    }

by replacing it with the code

    this->md5mesh.resize(n_mesh);

in the MD5 constructor.  When I defined the default constructor for the MD5MESH class I specified in the initializer list for the constructor that the default values for the mode, and visible data members are GL_TRIANGLE, and true, respectively.  (Yes, the visible data member is another of those pesky cases where unsigned char is being used in place of a bool.)

Again, in the old MD5_load_action() function the code:

    MD5ACTION *md5action;

    ++md5->n_action;

    md5->md5action =
        ( MD5ACTION * ) realloc( md5->md5action,
        md5->n_action * sizeof( MD5ACTION ) );

    md5action = &md5->md5action[ md5->n_action - 1 ];

    memset( md5action, 0, sizeof( MD5ACTION ) );

    strcpy( md5action->name, name );

    md5action->curr_frame = 0;
    md5action->next_frame = 1;

becomes the following code:

    MD5ACTION *md5adction;

    this->md5action.push_back(MD5ACTION(name));

    md5action = &this->md5action.back();

in the MD5::load_action() class method.  The need to initialize the name, curr_frame, and next_frame data members separately, from the action of allocating the space for the new array element can be eliminated because of the use of a constructor.

There are probably other examples I could cite but, at the very least, this brief explanation should help you be able to find them should you be interested in the details.

I modified the MD5::get_action(), and MD5::get_mesh() methods by moving loop invariant code outside of the loop.  This has been discussed previously, so I won't repeat the details here.

The old function MD5_build_vbo(MD5 *md5, unsigned int mesh_index) is, IMHO, another case of a function which is part of the interface for the wrong class.  (For other examples see this post.)  The parameter mesh_index is used to select which MD5MESH object owned by the MD5 object the function is going to operate on.  In my modified version of the code the method is part of the MD5MESH class.  Note that making this method part of the MD5MESH class definition is consistent with the way I've implemented the MD5MESH class destructor; many of the resources which need to be freed in the MD5MESH class destructor (vertex_data, vbo, and vbo_indice) were already created here by the old MD5_build_vbo() function.  Also, the old MD5_build() function which formed the basis of the MD5::build() class method looped over all of the MD5MESH objects, and allocated their vao resource.  IMHO, this again violates encapsulation.  Consequently, I have moved the following lines from that loop:

    glGenVertexArraysOES( 1, &md5mesh->vao );

    glBindVertexArrayOES( md5mesh->vao );

    MD5_set_mesh_attributes( md5mesh );

    glBindVertexArrayOES( 0 );

into a new MD5MESH::build_vao() method.  This means that MD5MESH now completely owns the creation/allocation, and the destruction/release of its own resources.

Why not just move the code into the existing MD5MESH::build_vbo() method?  This would, unfortunately, introduce a bug.  The MD5::build2() method doesn't allocate a vao resource for the each of the meshes but it does invoke MD5MESH::build_vbo() on each of its MD5MESH objects.  By putting the allocation of the vao resource into a separate class method MD5::build() can invoke the method on each of its MD5MESH objects, and MD5::build2() can continue to not build a vao resource by not invoking the method.

Scattered throughout the original version of the code are statements such as:

    memcpy(&this->md5mesh[mesh_index].md5vertex[int_val],
           &md5vertex,
           sizeof(MD5VERTEX));

These calls to memcpy() are consistent with my claim that the module, as written, isn't really C++ code but C code which can be compiled with a C++ compiler.  In effect, what this call to memcpy() really does is assign the contents of the object md5vertex to this‑>md5mesh[mesh_index].md5vertex[int_val]; in C this is a very common way of performing this task for data types created by the programmer because C doesn't support operator overloading.   In C++ one would normally define an assignment operator for the MD5VERTEX class and use that to write the above statement as:

    this->md5mesh[mesh_index].md5vertex[int_val] = md5vertex;

And, if you look at my changes to the MD5VERTEX class you'll quickly see that I did exactly that.  But there are a number of other data types for which I haven't yet written an assignment operator.  For example, in the old function MD5_set_pose() there are the two lines:

    uv_array[j].x = md5vertex.uv.x;
    uv_array[j].y = md5vertex.uv.y;

uv_array is a pointer to an array of texture coordinates, and that array is declared to be an of type vec2 elements.  Likewise, md5vertex.uv is declared to have type vec2.  No where have I, or the author, defined an assignment operator for the vec2 type.  Yet I was able to successfully rewrite those two lines of code as:

    uv_array[j] = mv5vertex.uv;

The C++ compiler silently created a default assignment operator for the vec2 type.  Using this fact I have searched throughout the code for places where memcpy() has been used as a substitute for the assignment operator, and, where possible, replaced the call to memcpy() with simple assignment statements.  IMHO this makes the code much easier to read.

I've removed many, perhaps even most, of the uses of memcpy(), and, using initializer lists as mentioned above, I was also able to remove many calls to memset().  It's my opinion that using memcpy(), and memset() unnecessarily just makes the code harder to read.

Other places in the code you can find where the compiler has also quietly created default copy constructors for the data types vec3, and vec4.  For example, in the class method MD5::build_bind_pose_weighted_normals_tangents() you'll see that I changed the declaration

    vec3    normal  = { md5vertex->normal.x,
                        md5vertex->normal.y,
                        md5vertex->normal.z },
            tangent = { md5vertex->tangent.x,
                        md5vertex->tangent.y,
                        md5vertex->tangent.z };

to

    vec3    normal(md5vertex->normal),
            tangent(md5vertex->tangent);

which uses the default copy constructor for vec3.  Again, it may be a matter of opinion, but I think the second version is easier to read.  And even if it's equally difficult to read it's less error prone to type the second version; fewer characters to type means fewer opportunities to screw up.

This concludes my first pass through the modules presented in the book to facilitate game development for iOS, and Android.  In this first pass I've tried to focus on making the existing code better, rather than rewriting entire modules from scratch.  I'm sure there are still a lot of places where encapsulation could be improved.  To that end I'm tempted to go through each of the class types, and redeclare, one by one, the various data members as private to see how many places those data members are being accessed outside of the class which owns them, and then decide if they should really be visible outside of the class.  For the time being I have other code improvements I want to address.  Remember that there are still two of the author's modules which I haven't rewritten, the GFX, and AUDIO modules.  In my next post I'll begin putting into place the tools I need to help me restructure the GFX module.

Thursday, December 5, 2013

Implementing Multiple Types of Lights

This post continues my series about rewriting the code from Game and Graphics Programming for iOS and Android with OpenGL ES 2.0 to better leverage C++ abilities.  Finally, in Chapter 10 I get to write some code which uses real object oriented polymorphism.  This module from the book was my favorite to rewrite.  A lot of conditional logic which used to be exposed in the application code has been hidden from the application programmer inside this class hierarchy which implements lighting.

To get a copy of the code as described in this post pull a copy from GitHub using the tag chapter10.

I need to start by describing how the original code implements the LAMP type to support multiple types of light sources.  The author implements five different types of lighting in this chapter.  Each of these types requires a different set of data members be passed to the OpenGL programmable shaders.  Along with the various lighting parameters needed to implement each light source type the data structure has a data member called type which holds an integer value 0 through 4.  One of the first things I did was create a new enumerated data type (LampType) to give each of these integer constants a symbolic name.  You can find this new type near the top of the file SDK/chapter10-1/templateApp.cpp.

The author's final data structure holding all of the values needed for each of the light types is:

    typedef struct {
        char    name[MAX_CHAR];
        vec4    color;
        vec3    direction;
        vec4    position;   // Position of the lamp in world
                            // coordinates
        float   linear_attenuation;
        float   quadratic_attenuation;
        float   distance;
        /* The cosine of half the field of view of the spot
         * (in radians).
         */
       float   spot_cos_cutoff;
        /* Factor ranging from 0 to 1 to smooth the edge of
         * the spot circle.
         */
        float   spot_blend;
        /* The spot direction is calculated by multiplying
         * the direction vector by the invert of the
         * modelview matrix of the camera.
         */
        vec3    spot_direction;
        unsigned char type;
    } LAMP;

As you can see this data structure has eleven data members.  The code never sends the name datum to the OpenGL shaders.  The code always* sends the type data member because the OpenGL shaders need that information to know how to process the rest of the data members.  The following table shows which data members are needed for each type of light source:


TypeColorDirectionPositionAttenuationSpot
LinearQuadraticDistanceAngleBlendDirection
Directional
Point
Point w/Attentuation
Spherical Point
Spot

* That is the author's code ought to always send the type data member.  See below for details.

The author's original code has a multi-branch if statement which was executed each time the code called the program_draw() function.  The if statement looks like:

    if (lamp->type == 0) {
        .
        .
        .
    } else if (lamp->type == 1) {
        .
        .
        .
    } else if (lamp->type == 2) {
        .
        .
        .
    } else if (lamp->type == 3) {
        .
        .
        .
    } else {    // lamp->type == 4
        .
        .
        .
    }

The original code uses this if statement to only send the data members of the LAMP type which are needed for each particular light source type.

My rewrite of the code using polymorphism accomplishes the same result but moves the burden for managing which data needs to be transferred to the OpenGL shader code onto the LAMP class hierarchy.  LAMP and its subclasses have the following organization:
  • LAMP
    • DirectionalLamp
    • PointLamp
      • AttenuatedPointLamp
      • PointSphereLamp
      • SpotLamp
The parent class LAMP declares a virtual function push_to_shader().  For the LAMP class this function has the responsibility sending its type† and color data members to the OpenGL shader programs.

† This is actually a bug in the code.  The author's original code, and, consequently my code which mimics the behavior of his code, never passes the type value to the OpenGL shaders. Both the author's code and my code should be doing this so that the shader programs can implement conditional logic to dynamically calculate the lighting values for the corresponding light source type.

[Note:  The class hierarchy I've implemented is an artifact of the order in which the book presented the various light source types.  Since the data members of the PointSphereLamp are a subset of the data members of AttenuatedPointLamp, I could have made AttenuatedPointLamp a subclass of PointSphereLamp.  Also, if I had used multiple inheritance I could have re-used the direction data member by making SpotLamp be a subclass of both DirectionalLamp, and PointLamp.  But, to do this I would probably have used virtual inheritance since these two classes (DirectionalLamp, and PointLamp) share a common super class (LAMP) in the class hierarchy.]

Each subclass also implements its own version of the virtual function push_to_shader().  The first responsibility of each of these functions is to invoke the push_to_shader() method of its parent class, and then "push to the OpenGL shader" the data members which are unique to its own class.

This implementation of push_to_shader() as a virtual function means that the entire multi-branch if statement shown above can be replaced with the single line of code:

    lamp->push_to_shader(program);

So, for example, if the code has created a light source using "new SpotLamp(…)" the line of code above would invoke SpotLamp::push_to_shader().  Upon receiving control this method would invoke the corresponding method from its parent class, that is, it invokes PointLamp::push_to_shader().  Likewise, PointLamp::push_to_shader() invokes LAMP::push_to_shader().  LAMP::push_to_shader() sends its data member color to the shader program, and returns control to the method which called it.  In this case that would be PointLamp::push_to_shader() which, upon regaining execution control, pushes its data member position to the shader program, and then returns control SpotLamp::push_to_shader(), etc.  Again, as you can see from this example, the burden for managing which data members get pushed to the OpenGL shader is assumed by the LAMP class, and its subclasses, hopefully, making the job of the application programmer just that much easier.

For the sample programs from Chapters 11, and 12 the class LAMP, and its subclasses will be renamed as LIGHT, DirectionalLight, etc. and their code will be moved into the files SDK/common/light.h, and SDK/common/light.cpp.

No new modules are introduced in Chapter 11 so my next post will cover rewriting the MD5 module used in Chapter 12.