
In this reprinted #altdevblogaday in-depth piece, Lionhead and Ubisoft veteran Don Williamson shares the simple reflection API developed to replace Unreal's object model in Splinter Cell: Conviction.

Eric Caoili, Blogger

January 6, 2012


[In this reprinted #altdevblogaday in-depth piece, Lionhead and Ubisoft veteran Don Williamson shares the simple reflection API developed to replace Unreal's object model in Splinter Cell: Conviction.] The first part in my Reflection in C++ series gave a high-level overview of many of the possibilities open to you when adding reflection to your games. In this second part I'm going to go into details and cover the system used to aid the rendering engine in Splinter Cell: Conviction (SC5).

The motivation for the development of the SC5 engine was a clean break from the past. We were working with a very, very large code base that used Unreal 2.5 with many years of modifications and rewrites. While immensely stable, visually very good looking, and a code base you could bet a few million dollars on, it slowed the development of new techniques required to push the SC franchise onto the next generation of consoles (the Xbox 360, circa 2005). Compile times were painfully slow, link times were on the order of minutes, and it suffered from the classic Unreal issue of requiring huge rebuilds whenever you changed a .uc script file (used to define your interfaces to the editor, among other things; partially solved in Unreal 3). There were many levels of pipeline and engine indirection added to ship titles that were slowing down the iteration and development of new techniques, while simultaneously contributing to a lack of runtime performance on the target platform.

After a couple of months, we were at a few seconds for compile/link on the PC, sub-1-second single-file iteration times, vastly simpler/faster import pipelines, multi-threaded performance that was orders of magnitude faster on the Xbox 360 than the old engine, the ability to live edit C++ rendering code, and no compile dependency on Unreal. As a result we lost a few key technologies along the way that would have helped immensely, such as the ability to carve indoor space up using BSP volumes. We also ended up recreating some already existing features such as light association control and skeletal attachments. But the full story is something for another day, and Stephen Hill's fantastic GDC 2010 piece on the development of some of the rendering technologies can give a bit more insight (Rendering with Conviction).

This post will cover just the reflection API developed to replace Unreal's object model. The implementation was very simple, written in a couple of days, constrained to a single cpp/h file pair, and slowly grew as the needs of the engine evolved. It also helped us develop the new engine while the old engine was still active, supporting 50-odd developers and keeping game progress undisturbed. It is my hope that I can demonstrate how simple (and sometimes naive) solutions can help ship great games, as long as you're willing to rub shoulders with some pretty big limitations. An example of such a system is Reflectabit, written a few years ago in my spare time. It's unfinished but contains enough to serve as a tutorial of sorts. This post is from memory as I don't have the code to hand anymore.

The Type ID

The first task in any reflection API is defining what a "type id" is: how can you reference types in code? For SC5 we needed a type ID that:

  • Could reference built-in types (int, char, etc) as well as custom class types.

  • Was unique for each type.

  • Was persistent between program invocations.

  • Could be used for serialization.

The simplest of solutions is to use an enum:

  enum TypeID
  {
    TYPE_INT,
    TYPE_CHAR,
    TYPE_MYTYPE,
    // ...
  };

Each time you add a new type, you add it to the end of the enum. This needs a means of mapping a C++ type to its enum:

  // Each reflected type must specialize this
  template <typename TYPE>
  inline TypeID GetTypeID()
  {
    // Compile-time assert: type not implemented (leaving out the return value is enough, really)
  }

  // Specialization examples
  template <> inline TypeID GetTypeID<int>() { return TYPE_INT; }
  template <> inline TypeID GetTypeID<char>() { return TYPE_CHAR; }
  template <> inline TypeID GetTypeID<MyType>() { return TYPE_MYTYPE; }

This is not particularly extensible or maintainable in larger projects: if you're working on a changelist that adds a new type and you are using it for serialization, any incoming changes from members of your own team have the potential to clobber all your data, making the act of submission a chore that is potentially very dangerous. Type IDs will accrue over time and you must ensure that types are only ever added at the end of the list. It's also not suitable for a reflection API that ships as part of a 3rd party library expecting its client code to add its own types, although there are many examples of such systems in decades-old code (WM_APP).

One good benefit of this method is that you can immediately see the type name in the debugger when inspecting values of type TypeID. However, you can't see them at runtime unless you add a means of also mapping the enum value to a string. Of course, there are ways to do this, for example:

  // TypeIDs.inc:

  // List all Type IDs
  TYPEID(TYPE_INT)
  TYPEID(TYPE_CHAR)
  TYPEID(TYPE_MYTYPE)

  // TypeIDs.h:

  // Build the enum table
  #define TYPEID(type) type,
  enum TypeID
  {
    #include "TypeIDs.inc"
  };

  // Use the pre-processor "stringiser" operator to generate the name of each type
  #undef TYPEID
  #define TYPEID(type) #type,
  const char* g_TypeNames[] =
  {
    #include "TypeIDs.inc"
  };

This is a well-used technique in many shipping C/C++ products where its variants have been branded X Macros. At this point it's all getting a bit messy/overkill; we ruled it out on SC5 without much thought as there was a much simpler solution:

  // Each reflected type must specialize this
  template <typename TYPE>
  const char* GetTypeName()
  {
    // Compile-time assert: type not implemented
  }

  // Specialization examples
  template <> inline const char* GetTypeName<int>() { return "int"; }
  template <> inline const char* GetTypeName<char>() { return "char"; }
  template <> inline const char* GetTypeName<MyType>() { return "MyType"; }

  template <typename TYPE>
  u32 GetTypeID()
  {
    // Calculates the string hash once and then caches it for further use
    static u32 type_id = CalcStringHash(GetTypeName<TYPE>());
    return type_id;
  }
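
GetTypeID leans on a CalcStringHash helper that isn't shown here. As a minimal sketch, any decent string hash will do; for illustration, here's a 32-bit FNV-1a (an assumption on my part; the hash SC5 actually used is discussed below):

  // A minimal 32-bit FNV-1a string hash, standing in for CalcStringHash
  inline unsigned int CalcStringHash(const char* str)
  {
    unsigned int hash = 2166136261u;
    for (; *str; ++str)
    {
      hash ^= (unsigned char)*str;
      hash *= 16777619u;
    }
    return hash;
  }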

As long as your hash function is good this is a great way of defining your type ID support:

  • Practically, you are guaranteed unique, persistent IDs as long as your typenames are unique (you can prefix namespaces if you like).

  • Types can be added in isolation; all you need to do is implement your GetTypeName alongside its type in a header file.

  • The type names are readily available in the debugger and are part of your executable.

The choice of hash function is important, but don't overthink the issue. SC5 used CRC32 for both type names and object names, even though that is technically not the purpose of CRC32 (it's a trivial method of error detection in data packets). While I was working on the project we had one hash collision, with some materials, that was easily sorted with a rename (collisions were tracked offline in a MySQL database). These days I use MurmurHash3; DJB2 is a nice, simple implementation and the field is steadily progressing (e.g. see CityHash). Whatever you choose, collision visibility is essential, as collisions can cause very subtle data integrity issues.

Before going any further, a small review of some other methods of generating a type ID may be of interest. The first is to use C++ RTTI to replace GetTypeName:

  template <typename TYPE>
  const char* GetTypeName()
  {
    return typeid(TYPE).name();
  }

Note that this is not using any runtime aspects of C++ RTTI support beyond calling a function on the returned type_info object, which needs to store its information somewhere in memory. Technically this means you should not be penalized for its use at runtime (it doesn't alter the size of any of your objects), but I've only tested this on MSVC platforms. This is a remarkably simple solution that doesn't require you to specialize GetTypeName. One issue you can encounter is that RTTI is a very loosely standardized feature of C++ and different platforms may return different names for each type; I believe GCC mangles the result in some way, although none of these issues are insurmountable. Curiously, if you disable RTTI in an MSVC project, typeid still works in this context, lending weight to the theory of no runtime penalty; other compilers such as GCC, however, fail to compile.

Another non-standard side-step involves the use of the pre-defined __FUNCTION__ macro, or whatever equivalent your compiler has these days:

  template <typename TYPE>
  const char* GetTypeName()
  {
    // GCC's equivalent is __PRETTY_FUNCTION__
    return __FUNCSIG__;
  }

This is MSVC-specific but the output is:

  const char *__cdecl GetTypeName<struct MyType>(void)

An important realization is that this string is unique if your type name is unique, so you don't really need to change it. If that bothers you, however, you can do a quick parse of the string and cache the result locally in a static char buffer, or similar. A final method I'd like to cover is one that has caught me out in a careless moment:

  #define GetTypeName(type) #type

It's alluringly simple and works, except for one fatal flaw; it can't be used within templates:

  template <typename TYPE>
  void SomeUtilFunction()
  {
    // Returns the string "TYPE"
    const char* type_name = GetTypeName(TYPE);
  }

Types in Memory

The next step is defining how types are represented in memory, their dynamic retrieval, and how they can be used to create objects of that type. This involved a few structures:

  // Used for both type names and object names
  struct Name
  {
    Name() : hash(0), text(0) { }
    Name(const char* name) : hash(CalcStringHash(name)), text(name) { }

    // Names are compared by hash (needed for the TypeMap below)
    bool operator< (const Name& other) const { return hash < other.hash; }

    unsigned int hash;
    const char* text;
  };

  // Function types for the constructor and destructor of registered types
  typedef void (*ConstructObjectFunc)(void*);
  typedef void (*DestructObjectFunc)(void*);

  // The basic type representation
  struct Type
  {
    // Parent type database
    class TypeDB* type_db;

    // Scoped C++ name of the type
    Name name;

    // Pointers to the constructor and destructor functions
    ConstructObjectFunc constructor;
    DestructObjectFunc destructor;

    // Result of sizeof(type) operation
    size_t size;
  };

  // A big registry of all types in the game with methods to manipulate them
  class TypeDB
  {
  public:
    // Example methods; implementations discussed later
    template <typename TYPE> Type& CreateType();
    Type* GetType(Name name);
  private:
    typedef std::map<Name, Type*> TypeMap;
    TypeMap m_Types;
  };

We didn't use the STL to define our types; I'm using it above to demonstrate intent. The Name type always stored both the null-terminated string pointer and hash of that string. Type names were always present, stored in a read-only segment of memory when the compiler encounters any calls to GetTypeName. Object names were never stored in memory. Instead, they were stored in a MySQL database which was queried by a Visual Studio debugger plugin whenever it wanted to display a name in the watch window. This was a great way of always having object names present in console builds without consuming runtime memory, even in our most final release builds. Types are created dynamically as part of game initialization and a simple implementation of CreateType would be:

  template <typename TYPE>
  Type& CreateType()
  {
    // The name is derived from the type's registered name
    Name name(GetTypeName<TYPE>());

    // Only allocate the type once (GetType will allocate an empty type if it doesn't exist yet)
    Type* type = 0;
    TypeMap::iterator type_i = m_Types.find(name);
    if (type_i == m_Types.end())
    {
      type = new Type;
      m_Types[name] = type;
    }
    else
    {
      type = type_i->second;
    }

    // Apply type properties
    type->type_db = this;
    type->name = name;
    type->size = sizeof(TYPE);
    return *type;
  }

  // Example registration on initialization
  TypeDB db;
  db.CreateType<MyType>();
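
GetType, declared on TypeDB above and used throughout the rest of this post, is the lookup counterpart. A minimal sketch, assuming (as described later) that it allocates an empty placeholder for types that haven't been registered yet:

  Type* GetType(Name name)
  {
    // Return the type if it's already registered
    TypeMap::iterator type_i = m_Types.find(name);
    if (type_i != m_Types.end())
      return type_i->second;

    // Otherwise allocate an empty placeholder for a later CreateType call to fill in
    Type* type = new Type;
    type->type_db = this;
    type->name = name;
    m_Types[name] = type;
    return type;
  }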

This gives enough information to allocate space for objects of a given type, but a means of constructing/destructing those objects has yet to be defined. In C++ you can't take function pointers to a constructor or destructor (see C++98 12.1 for more info), but given a memory address, they can be called:

  template <typename TYPE> void ConstructObject(void* object)
  {
    // Use placement new to call the constructor
    new (object) TYPE;
  }
  template <typename TYPE> void DestructObject(void* object)
  {
    // Explicit call of the destructor
    ((TYPE*)object)->TYPE::~TYPE();
  }

This now allows CreateType to fully define Type:

  template <typename TYPE>
  Type& CreateType()
  {
    // ... alloc type, as before ...

    // Apply type properties
    type->size = sizeof(TYPE);
    type->constructor = ConstructObject<TYPE>;
    type->destructor = DestructObject<TYPE>;
    return *type;
  }

This is enough information to dynamically create objects of a given type, which will be discussed later. Finally, the default type database needs to register all C++ types that were supported in its constructor:

  TypeDB::TypeDB()
  {
    CreateType<char>();
    CreateType<short>();
    CreateType<int>();
    CreateType<float>();
    // ... and so on ...
  }
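
With a type's size and its constructor/destructor pointers registered, dynamic object creation falls out naturally. The SC5 code for this isn't shown here, but a rough sketch of such a factory (CreateObject and DestroyObject are hypothetical helpers, using malloc/free purely for illustration) might look like:

  // Hypothetical helpers, not part of the API shown above
  void* CreateObject(TypeDB& db, Name type_name)
  {
    Type* type = db.GetType(type_name);

    // Allocate raw storage and construct the object in place
    void* object = malloc(type->size);
    type->constructor(object);
    return object;
  }

  void DestroyObject(Type* type, void* object)
  {
    type->destructor(object);
    free(object);
  }

  // Example use: MyType* obj = (MyType*)CreateObject(db, "MyType");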

Fields

Each type contains an array of fields that describe it (called a PropertyInfo in SC5). We needed the field descriptions to be able to serialize and inspect any object, requiring the following structure:

  struct Field
  {
    // C++ name of the field, unscoped
    Name name;

    // Name of the field's type, and a pointer to its Type
    Name type_name;
    Type* type;

    // Is this a pointer field? Note that this becomes a flag later on...
    bool is_pointer;

    // Offset of this field within the type
    size_t offset;
  };
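
To make the role of offset concrete, inspection or serialization code can address a field's storage by adding that offset to the object's base address; a minimal sketch (illustration only, not the SC5 code):

  // Get a pointer to a field's storage within an object
  void* GetFieldPtr(void* object, const Field& field)
  {
    return (char*)object + field.offset;
  }

  // Example: read a field known to hold an int
  int ReadIntField(void* object, const Field& field)
  {
    return *(int*)GetFieldPtr(object, field);
  }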

Only value and pointer field types were supported and the const-ness was irrelevant. We wanted a means of automatically generating the properties of a field to avoid manually-specified registration errors. A typical structure with the desired registration mechanism could look like this:

  struct MyType
  {
    int x;
    float y;
    char z;

    OtherType other;
    OtherType* other_ptr;
  };

  Field fields[] =
  {
    Field("x", &MyType::x),
    Field("y", &MyType::y),
    Field("z", &MyType::z),
    Field("other", &MyType::other),
    Field("other_ptr", &MyType::other_ptr)
  };

  // Create the type and specify its fields
  TypeDB db;
  db.CreateType<MyType>().Fields(fields);

Of course, while the properties of a field are automatically deduced, the specification of which fields comprise a type is manual. This caused a few errors along the way on our small team, but we felt the effort involved in trying to minimise them wasn't a priority. Note that the field for OtherType is created before OtherType itself is necessarily created; to remove the need to register types in any specific order, any call to GetType would allocate the type if it didn't already exist, and a subsequent call to CreateType would retrieve that allocated copy and describe it.

Note also that the field name is manually specified, which is another potential source of user error. Instead, we could have used the pre-processor stringizing operator to ensure it was in sync. Again, this wasn't considered important at the time and I didn't fancy adding extra layers of pre-processor macros. I did start playing around with this a couple of years ago and some investigative results can be found in Reflectabit's serialization tests. I still prefer the manual solution, however.

Implementing the Field constructor requires answering three questions at compile-time:

  • What type is the field?

  • Is it a pointer?

  • What is the field offset?

This required two utility functions that used partial template specialization on the type qualifiers to identify a pointer and also strip a pointer from the type:

  // Does a type specification contain a pointer?
  template <typename TYPE>
  struct IsPointer
  {
    static const bool val = false;
  };
  // Specialize for pointer types
  template <typename TYPE>
  struct IsPointer<TYPE*>
  {
    static const bool val = true;
  };

  // Exactly the same, except the result is the type without the pointer
  template <typename TYPE>
  struct StripPointer
  {
    typedef TYPE Type;
  };
  // Specialize to strip the pointer
  template <typename TYPE>
  struct StripPointer<TYPE*>
  {
    typedef TYPE Type;
  };

It also required the equivalent of the offsetof macro, making the assumption that we never use virtual or multiple inheritance (we didn't). With these tools, the Field constructor is quite simple:

  struct Field
  {
    template <typename OBJECT_TYPE, typename FIELD_TYPE>
    Field(Name name, FIELD_TYPE OBJECT_TYPE::*field)
      : name(name)

      // Store the type name as we don't have an owning type database yet
      , type_name(GetTypeName< typename StripPointer<FIELD_TYPE>::Type >())
      , type(0)

      , is_pointer(IsPointer<FIELD_TYPE>::val)

      // The offsetof trick, spelled out so it works with a pointer-to-member
      , offset((size_t)&(((OBJECT_TYPE*)0)->*field))
    {
    }
  };

The Fields method in Type uses the Named Parameter Idiom to assign a field list to a type. This technique is used pretty frequently to make registration as friendly as possible. Fields is implemented using templates to figure out the size of the C array:

  struct Type
  {
    template <int SIZE>
    Type& Fields(Field (&init_fields)[SIZE])
    {
      for (int i = 0; i < SIZE; i++)
      {
        Field f = init_fields[i];

        // Assign the type pointer using the parent type database
        // and add to the type's field list
        f.type = type_db->GetType(f.type_name);
        fields.push_back(f);
      }
      return *this;
    }

    // New vector of fields for this type
    std::vector<Field> fields;
  };

Inheritance

All registered types could only have one base class and fields declared within that type only existed in the fields array of that type (i.e. the fields within an inheritance hierarchy weren't merged). Registering a base class was done with another method in Type:

  struct Type
  {
    template <typename TYPE>
    Type& Base()
    {
      base_type = type_db->GetType(GetTypeName<TYPE>());
      return *this;
    }

    Type* base_type;
  };

  // Example registration
  TypeDB db;
  db.CreateType<SomeType>().Base<ItsBaseType>();
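
Because fields aren't merged up the hierarchy, any code that wants every field of an object has to walk the base chain itself. A minimal sketch of that walk (assuming base_type is null for types with no base):

  // Gather the fields of a type and all of its base types
  void GatherFields(const Type* type, std::vector<Field>& out_fields)
  {
    for (const Type* t = type; t; t = t->base_type)
      out_fields.insert(out_fields.end(), t->fields.begin(), t->fields.end());
  }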

Enumerations

Similar to what was described in part 1 of this series, enumeration constants were simply a name/value pair:

  struct EnumConst
  {
    EnumConst(Name name, int value) : name(name), value(value) { }
    Name name;
    int value;
  };

However, in an attempt to keep the API simple and avoid any inheritance trees, there was no enumeration type. Instead, each type had a list of enum constants which would be empty if the type was not an enumeration:

  struct Type
  {
    template <int SIZE>
    Type& EnumConstants(EnumConst (&input_enum_consts)[SIZE])
    {
      for (int i = 0; i < SIZE; i++)
        enum_constants.push_back(input_enum_consts[i]);
      return *this;
    }

    // Only used if the type is an enum type
    std::vector<EnumConst> enum_constants;
  };

Registering enumeration constants then became as easy as:

  enum TestEnumType
  {
    VAL_A, VAL_B, VAL_C
  };

  // Collate the enum constants
  EnumConst enum_consts[] =
  {
    EnumConst("VAL_A", VAL_A),
    EnumConst("VAL_B", VAL_B),
    EnumConst("VAL_C", VAL_C),
  };

  // Create the enum type
  TypeDB db;
  db.CreateType<TestEnumType>().EnumConstants(enum_consts);

Again, this relies upon manual registration of any enumeration constants you have and can be error-prone if you forget to add a constant or incorrectly name it. I can't recall this causing us any issues, but the potential for making mistakes was there. We were careful and decided that going any further would not be a good investment of our time.

Attributes

Attribute systems can get pretty complicated, but we wanted something quick and simple that could be improved at a later date if necessary. We pretty much knew from the outset what attributes we would require, so each field was extended to contain:

  struct Field
  {
    Field& Flags(unsigned int f)
    {
      flags = f;
      return *this;
    }

    Field& Desc(const char* desc)
    {
      description = desc;
      return *this;
    }

    Field& Group(const char* g)
    {
      group = g;
      return *this;
    }

    // An ORing of boolean attributes and a version number (explained later)
    unsigned int flags;

    // An optional property description for editors
    Name description;

    // An optional user interface grouping node name for editors
    Name group;
  };

These are entirely hard-coded attributes that the reflection system defines. There are only two string attributes, description and group, which are used for user interface population: if you're defining a material type, you can group its properties into (for example) "Textures" and "Lighting". The rest of the attributes were boolean flags, merged into the single flags field. Some examples are:

  enum Flags
  {
    // Is this field a pointer type? (replacing the is_pointer bool in Field)
    F_Pointer = 0x01,

    // Is this a transient field, ignored during serialization?
    F_Transient = 0x02,

    // Is this a network transient field, ignored during network serialization?
    // A good example for this use-case is a texture type which contains a description
    // and its data. For disk serialization you want to save everything, for network
    // serialization you don't really want to send over all the texture data.
    F_NetworkTransient = 0x04,

    // Should tools treat this field as read-only?
    F_ReadOnly = 0x08,

    // Is this a simple type that can be serialized in terms of a memcpy?
    // Examples include int, float, any vector types or larger types that you're not
    // worried about versioning.
    F_SimpleType = 0x10,

    // Set if the field owns the memory it points to.
    // Any loading code must allocate it before populating it with data.
    F_OwningPointer = 0x20
  };
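
Putting this together, registration chains the named-parameter setters on each field. A hypothetical material type (the names and fields below are made up for illustration, not SC5 code, and assume GetTypeName specializations exist for the types involved) might be registered like this:

  // A hypothetical material type, purely for illustration
  struct Material
  {
    TextureRef* diffuse_map;    // assume TextureRef is another registered type
    float specular_power;
    int gpu_handle;
  };

  Field material_fields[] =
  {
    Field("diffuse_map", &Material::diffuse_map)
      .Group("Textures")
      .Desc("Base colour texture")
      .Flags(F_Pointer | F_OwningPointer),

    Field("specular_power", &Material::specular_power)
      .Group("Lighting")
      .Desc("Exponent of the specular term")
      .Flags(F_SimpleType),

    // Runtime-only state that tools can see but never edit or serialize
    Field("gpu_handle", &Material::gpu_handle)
      .Flags(F_Transient | F_ReadOnly)
  };

  TypeDB db;
  db.CreateType<Material>().Fields(material_fields);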

As is evident, you could only add new boolean attributes if there were flag bits left, and adding attributes with more complexity required extending Field. Perfect for our use-case but not ideal for a more general system.

Containers

Our container support was very primitive, similar to the container support in Reflectabit but lacking its completeness. The basic premise was to use an interface pointer to an underlying container wrapper, like so:

  struct IContainer
  {
    virtual int GetCount(void* container) = 0;
    virtual void* GetValue(void* container, int index) = 0;
    // ...etc...
  };

  // An example container implementation for std::vector
  template <typename TYPE>
  struct VectorContainer : public IContainer
  {
    int GetCount(void* container)
    {
      return (int)((std::vector<TYPE>*)container)->size();
    }
    void* GetValue(void* container, int index)
    {
      return &((std::vector<TYPE>*)container)->at(index);
    }
  };

  struct Field
  {
    // The first constructor, specified above
    // Field(Name name, FIELD_TYPE OBJECT_TYPE::*field) ...

    // An overload for std::vector container fields
    template <typename OBJECT_TYPE, typename FIELD_TYPE>
    Field(Name name, std::vector<FIELD_TYPE> OBJECT_TYPE::*field)
      // ...

