Reverse engineering 3D Movie Maker - Part 1

A while ago, I started reverse engineering Microsoft 3D Movie Maker to understand how it works and to develop my game reversing skills. This blog series is about my adventures in reversing 3D Movie Maker and some of the interesting things I learnt along the way.

Introducing 3D Movie Maker

Microsoft 3D Movie Maker logo

Microsoft 3D Movie Maker (3DMM) is a creativity game released under the Microsoft Home product line in 1995. It is pretty much as the name describes: an application that lets you create your own animated 3D movies. There were a few of these kinds of creativity games released in the 90s, such as Theatrix’s Hollywood series. 3DMM was unique in that you could create real 3D movies instead of animating pre-rendered content. The closest modern equivalents to 3DMM are tools like Source Filmmaker and sandbox games like Garry’s Mod.

Here’s a demo of the game from the YouTuber TechzoneTV:

Twenty-five years after the game was released, there is still an active online community of 3DMM enthusiasts who make new movies and mods for the game.

Previous reverse engineering of 3DMM was motivated by the limitations of the game. 3DMM gives you a fixed set of scenes, camera angles, actors and props. In the early 2000s a number of 3DMM enthusiasts, including Foone Turing, Frank Weindel and others, worked on reversing the file format and the 3D engine to figure out how to add custom 3D models into the game. Tools such as 3DMM Pencil were developed to explore and modify the contents of the game’s data files. Eventually the parts of the file format that managed 3D models were understood, leading to the development of the 7gen tool for converting models from standard 3D object file formats (e.g. OBJ) into the custom format used by 3DMM. There is also a cool project called Open3dmm which is working towards a complete rewrite of the game in C#, including a new 3D rendering engine using OpenGL.

I was less interested in working around limitations in the game and more interested in how the game engine works. This is the area that piqued my curiosity and where I started reverse engineering.

Why am I reverse engineering 3DMM over any other obscure 90s CD-ROM game? I’ve always been curious to know how it works. I remember playing a demo of 3DMM when I was eight years old and thinking it was the best thing I had ever seen. It was 3DMM that sparked my initial interest in computers, which eventually led to my career as a software developer.

The game itself

3D Movie Maker consists of a single executable file, 3DMOVIE.EXE, a set of CHK files, and some AVI files used for cutscenes.

Unlike many other games released in 1995, 3D Movie Maker was designed to run exclusively on Windows 95 and hence is a 32-bit Portable Executable file. This is good news for reverse engineering: all of the modern reverse engineering tools support Portable Executable files. There is also no DRM or obfuscation on the binary to make it more difficult to reverse. Another cool feature is that the use of software rendering instead of Glide or some early version of Direct3D means we can run and debug 3DMM natively on a modern Windows 10 machine with no major compatibility issues. I started my static reverse engineering using IDA but moved to Ghidra when it was released in 2019. I have complemented the static analysis with dynamic analysis using the WinDbg debugger.

The CHK files are 3DMM’s custom file format: “chunky” files [1]. These files are effectively a serialized graph of objects. Each chunk in the file has a type tag to identify how to deserialize the object, a chunk number to identify it, and references to other chunks. For example, a CAM chunk (camera angle) references an MBMP chunk (pre-rendered scene bitmap) and a ZBMP chunk (depth buffer bitmap). The structure of this file format was reverse engineered a long time ago, but only some of the chunk types have been documented.
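To make the "serialized graph" idea concrete, here is a minimal Python sketch of an in-memory model. The Chunk class, its field names, the (tag, number) key and the chunk numbers are all hypothetical. It does not describe the on-disk layout, only the three pieces of information mentioned above: a type tag, a chunk number and references to other chunks.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    """Hypothetical in-memory model of one chunk (not the on-disk layout)."""
    tag: str                    # type tag, e.g. "CAM"
    number: int                 # chunk number
    children: list = field(default_factory=list)  # (tag, number) keys of referenced chunks

def reachable(chunks, root):
    """Return every (tag, number) key reachable from root by following references."""
    seen, stack = set(), [root]
    while stack:
        key = stack.pop()
        if key in seen or key not in chunks:
            continue
        seen.add(key)
        stack.extend(chunks[key].children)
    return seen

# Illustrative data only: the chunk numbers are made up.
chunks = {
    ("CAM", 0): Chunk("CAM", 0, [("MBMP", 0), ("ZBMP", 0)]),
    ("MBMP", 0): Chunk("MBMP", 0),
    ("ZBMP", 0): Chunk("ZBMP", 0),
}
```

Calling reachable(chunks, ("CAM", 0)) walks from the camera angle to its scene bitmap and depth buffer, which is roughly what a loader has to do when it deserializes one object and everything it depends on.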

Using 3DMM Pencil to display scene backgrounds in BKGDS.3CN

Initial static analysis

It appears that there are three different projects statically linked into the 3DMM binary. A significant amount of the code is the 3D rendering engine, BRender from Argonaut Technologies. The engine is written in a combination of C and x86 assembly language. How do I know this? There are a bunch of strings in the binary that contain filenames and version information from a version control system. Similar strings can be found in other games that use the BRender engine.

BRender version control strings in the 3DMM executable

The rest of the binary is the game engine, which handles all of the 2D graphics, animation and user input. The engine is written in C++, which is evident from the extensive use of objects with virtual function tables (vtables). Unfortunately the use of virtual functions makes static reverse engineering difficult. Instead of static references to functions, calls to virtual functions must be resolved at runtime. This means you can’t statically locate cross references between the virtual functions in the binary. There’s no easy generic solution to this problem: I worked around it by spending a lot of time in WinDbg tracing function calls and call stacks to find which functions were called by other functions.

Interestingly, there is a section of the game engine that has a completely different coding style: the audio engine. This engine is referred to internally as “AudioMan”. It is also written in C++ but unlike the rest of the game the AudioMan code uses COM programming patterns. I found some COM interface ID (IID) constants in the binary and followed cross-references to them to find functions that looked like QueryInterface functions. Each QueryInterface function will compare a given IID against the supported IIDs for a class, and if the requested IID is found will increment the reference count and return a pointer to itself. Here is a very rough pseudo-C++ example of what one of these functions should look like:

HRESULT MyClass::QueryInterface(REFIID riid, void** ppvObject)
{
    // Check we have a pointer to receive the result
    if (ppvObject == nullptr)
        return E_POINTER;

    // Check if the requested interface is IUnknown
    // or any other supported interface
    if (riid == IID_IUnknown || riid == IID_ISomeCustomInterface)
    {
        this->AddRef();             // increment reference count
        *ppvObject = this;          // return the pointer to this object
        return S_OK;
    }

    *ppvObject = nullptr;           // interface not supported
    return E_NOINTERFACE;
}

These functions can be used to determine which interfaces are supported by the class. If the interface IDs happen to be standard COM interfaces, that’s great as now you can use the standard vtables to give names and parameters to virtual functions in the class! Unfortunately there weren’t many standard COM interfaces used other than IUnknown (which everything uses), but I did find some references to the IStream interface. The IStream interface provides stream I/O functions. I followed the cross-references to the IStream IID to find a custom stream class that acts as the interface between AudioMan and the game engine’s chunky data files.
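Finding the IID constants can be sketched in a few lines of Python. This is only an illustration of the idea, not the method used above: it searches a raw byte buffer for the in-memory layout of a GUID. The two IIDs are the standard values from the Windows SDK; the function name find_iid is made up for this example.

```python
import uuid

# Standard COM interface IDs from the Windows SDK.
IID_IUNKNOWN = uuid.UUID("00000000-0000-0000-C000-000000000046")
IID_ISTREAM = uuid.UUID("0000000C-0000-0000-C000-000000000046")

def find_iid(data: bytes, iid: uuid.UUID) -> list:
    """Return every offset in data where the GUID appears as it is laid out in memory."""
    # bytes_le stores the first three GUID fields little-endian, matching
    # the layout of a GUID structure on x86.
    needle = iid.bytes_le
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets
```

The offsets returned are positions in whatever buffer was passed in. If that buffer is the whole executable read from disk, they are file offsets and still need to be mapped to virtual addresses before looking for cross-references in a disassembler.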

Ghidra displaying decompilation of the QueryInterface function in a custom stream class

Some further Googling found that the AudioMan engine was used in a number of products from Microsoft’s Interactive Media Division, including the Encarta encyclopedia. One component of Encarta 97 includes a DLL called am16.dll which exports a number of functions that are similar to functions found in 3DMM. This was a cool find as the export table of the DLL gave me names for some of the AudioMan functions. I was able to port some of these names into my IDA database using BinDiff.

Recovering object type information

Most of the game engine classes inherit from a base class that provides reference counting and a custom runtime type identification system. Each class has a type tag that can be used to identify the class at runtime. The base class vtable contains five functions, which I have called:

The GetType function is useful for identifying classes and mapping classes to vtables. Type tags are 32-bit integers that also happen to contain ASCII characters. Here’s an example of a GetType function for one of the base classes:

mov eax, 0x45534142     ; "BASE"
ret

Some of the class tags are descriptive: for example “BASE” for base classes, “RND” for a random number generator class, “APP” for the main application class. Some are less obvious: a “GGCR” object, for example, is a colour palette (something something ColouR?). There are also patterns, for example a “GL” is a generic (?) list object. Chunk types in the data files that start with GL are lists of things, e.g. GLPI is a list of indexes of parts that make up a 3D model.
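Turning the 32-bit constant back into text is a one-liner once the byte order is fixed. The sketch below assumes the tag is read in little-endian memory order, which is what makes 0x45534142 spell "BASE". How tags shorter than four characters are padded is not covered here, so stripping NUL and space bytes is purely an assumption, and both function names are made up.

```python
import struct

def tag_to_str(value: int) -> str:
    """Decode a 32-bit type tag into the ASCII text it spells in memory."""
    raw = struct.pack("<I", value)            # little-endian, as stored on x86
    # Assumption: short tags are padded with NUL or space bytes.
    return raw.strip(b"\x00 ").decode("ascii")

def str_to_tag(text: str) -> int:
    """Encode a four-character tag as the 32-bit constant seen in the disassembly."""
    return struct.unpack("<I", text.encode("ascii"))[0]
```

For example, tag_to_str(0x45534142) gives "BASE", and str_to_tag("BASE") gives the constant back.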

The most interesting function in the base class is the CheckType function. This function takes a type tag and compares it to this object’s type tag. If it isn’t equal, it will call the parent’s CheckType function. This can be used to infer the class inheritance hierarchy by identifying the parent of each class. Here is a decompiled example of the CheckType function from the SFL (shuffler) class, which is a subclass of the random number generator class (RND):

BOOL SFL::CheckType(uint32_t type_tag)
{
    return (type_tag == "SFL") ? TRUE : RND::CheckType(type_tag);
}

BOOL RND::CheckType(uint32_t type_tag)
{
    return (type_tag == "RND") ? TRUE : BASE::CheckType(type_tag);
}

BOOL BASE::CheckType(uint32_t type_tag)
{
    return (type_tag == "BASE");
}
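Once each class's parent has been read out of its CheckType function, the chain of calls above collapses into a simple child-to-parent table. The Python sketch below models that table for the three classes in the example. The function names are made up, and it assumes the table contains no cycles.

```python
def ancestry(parents: dict, tag: str) -> list:
    """Follow child -> parent links from tag up to a class with no known parent."""
    chain = [tag]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain

def is_a(parents: dict, tag: str, wanted: str) -> bool:
    """Model of CheckType: true if wanted is tag itself or one of its ancestors."""
    return wanted in ancestry(parents, tag)

# Parent links taken from the three CheckType functions above.
parents = {"SFL": "RND", "RND": "BASE"}
```

With this table, ancestry(parents, "SFL") walks SFL, RND, BASE in the same order as the nested CheckType calls.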

I wrote a Ghidra script to automatically recover the class hierarchy from the binary. My script first finds the GetType functions by searching the .text section for a byte sequence that matches the instructions MOV EAX, <value>; RET. As this function is so small, it doesn’t have the usual function prologue or epilogue so it is often not identified as a function. The script then determines which of these “return constant value” functions returns a valid type tag. Once it has a list of valid GetType functions, it will then:

At this point, I could almost automatically recover the class hierarchy. Unfortunately the script found that none of the classes were derived from the base class. It turns out that there are three different base classes, each with the same “BASE” type tag. This broke the script as it couldn’t differentiate between the different base classes. I worked around this by looking at cross-references to the destructor for each base class. From this I was able to infer that almost all of the classes use one of the base classes, and only a few random classes use the other two base classes. I’m not sure why this is, but might look into this at a later date.
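The search for GetType candidates described earlier can be approximated outside Ghidra with a byte-level scan. The sketch below is standalone Python over a raw buffer, not the Ghidra script itself. It relies on the x86 encodings B8 imm32 for MOV EAX, imm32 and C3 for RET. The rule used to decide what counts as a valid tag (upper-case letters and digits, optionally padded with NUL or space) is an assumption made for this example.

```python
import re

# B8 imm32 = MOV EAX, imm32; C3 = RET. The lookahead allows overlapping matches.
GETTYPE_PATTERN = re.compile(rb"(?=\xB8(.{4})\xC3)", re.DOTALL)

TAG_BYTES = set(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")

def looks_like_tag(raw: bytes) -> bool:
    """Assumed heuristic: at least two upper-case letters or digits after stripping padding."""
    text = raw.strip(b"\x00 ")
    return len(text) >= 2 and all(b in TAG_BYTES for b in text)

def find_gettype_candidates(code: bytes) -> list:
    """Return (offset, tag bytes) for each 'return constant' stub whose constant looks like a tag."""
    hits = []
    for match in GETTYPE_PATTERN.finditer(code):
        imm = match.group(1)   # immediate bytes in memory order, so they read as text
        if looks_like_tag(imm):
            hits.append((match.start(), imm))
    return hits
```

A scan like this will also match byte sequences that are not real functions, such as data that happens to contain B8 and C3 six bytes apart, so the candidates still need to be checked against something else, such as the vtables that reference them.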

Another problem I had was that my script found a single class that derived from a class that didn’t exist in the binary. I suspect this was some kind of interface type that never got linked into the binary. It’s a bit dodgy, but I solved this problem by adding a special case in my script that looked for this specific class and set the parent to the parent’s parent.

Once I had these problems solved, I could automatically recover the class hierarchy. I updated my script to produce a cool GraphViz graph of how all of the classes fit together. Unfortunately the graph is huge because there are a lot of classes, so here is a small part of it:

Class hierarchy for the Theatre and Studio classes

In this example, the Studio (STIO) and Theatre (TATR) are both kinds of game objects (GOBs). The GOB class inherits from the CMH class which represents an entity that can receive messages from other entities. Finally, the CMH class inherits from the BASE class which handles reference counting and type checking. I guess I could have made my script produce a UML class diagram, but I’m a 90s game reverse engineer, not a 90s enterprise software engineer. :)
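Producing a graph like this from a child-to-parent table takes very little code. The following Python sketch emits Graphviz DOT text for the four relationships just described. The function name and the choice to draw edges from child to parent are assumptions for this example, not a description of the actual script.

```python
def to_dot(parents: dict) -> str:
    """Render child -> parent links as a Graphviz digraph in DOT syntax."""
    lines = ["digraph classes {"]
    for child, parent in sorted(parents.items()):
        lines.append(f'    "{child}" -> "{parent}";')
    lines.append("}")
    return "\n".join(lines)

# The relationships from the example above.
parents = {"STIO": "GOB", "TATR": "GOB", "GOB": "CMH", "CMH": "BASE"}
print(to_dot(parents))
```

The output can be saved to a file and rendered with the Graphviz dot tool, for example to PNG or SVG.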

Now that I had the class hierarchy, I could start adding some more automatic analysis features to my script. Some of the features I added include:

The result looks great in Ghidra: I can easily jump to a class’s vtable or virtual functions by finding the class namespace in the Symbol Tree.

Ghidra displaying the vtable for the ESLR class after running the script

Creating vtable structures makes it easier to identify the usage of virtual functions. As long as type information is set correctly, Ghidra can identify the use of virtual functions across your program. One example where this is useful is the global error handler object. The error handler class (ERS) has a function which is called whenever an error occurs. The error handler looks up the message for the error ID and displays a modal dialog box. Without any type information, the call to the error handler looks like any other virtual function call:

Call to the global error handler before applying type information

… but applying the ERS type to the global and renaming the entry in the vtable struct makes it much more readable:

Call to the global error handler after applying type information

Now I can use Ghidra’s Find References function to find all of the calls to the RaiseError function, which shows me everywhere in the binary where something can fail. It even shows the error code for each occurrence!

Find References showing all of the calls to the error handling function

I have pushed my Ghidra script for recovering the class hierarchy to GitHub. I have tested the script with several releases of 3D Movie Maker and it works for me, but your mileage may vary. The script can also recover class information from Microsoft Creative Writer 2, which shares the same game engine. That said, it is currently broken for CW2 as it looks like it was compiled with a newer compiler and/or different compile options that change the destructor code just enough to break my heuristics. I’ll fix that one day.

Next time: Game engine internals and finding Easter eggs.

[1] The name “chunky” files comes from an error message in the beta release of 3DMM. I guess they are called chunky files because they contain chunks.
