Is Your Code Ready to Prevent the Nedelin Catastrophe?

The article presents solutions on how to write more reliable game code and avoid common mistakes in video game programming.

On October 24, 1960, a massive explosion shook the ground at Baikonur Cosmodrome in the Soviet Union [1]. During preparation for a test flight, the rocket accidentally exploded, causing a fire and massive destruction. It was a very sad day, as more than 70 people lost their lives, and the USSR strategic rocket program suffered a major setback. The catastrophe was named after Mitrofan Nedelin, the commanding officer of the project, who died in the explosion.

Let us retrace the steps that led to the catastrophe [1-4]:

  1. Engineers were working on a very tight schedule (the rocket had to be launched by November 7, the anniversary of the Bolshevik Revolution).
  2. Because of the lack of time, many safety procedures were abandoned or not followed correctly.
  3. Many incidents and defects that occurred during launch preparation were ignored or not carefully followed up to analyze the consequences.
  4. On the day of the explosion, to speed up the launch process, multiple tests were conducted at the same time and in the wrong order.
  5. Due to delays with rocket start-up, the engineers continued to work directly at the launch pad. Many people were allowed to be there without any justification.
  6. It is believed that someone in the control room saw that the button for the second stage engines was in the wrong position and, without any warning, moved it back to the initial position.
  7. Other components (like the battery connection) were also incorrectly positioned, leading the system to execute a “start” order and ignite the second stage engines, causing the unprepared rocket to explode.

Figure 1. The Nedelin catastrophe at Baikonur Cosmodrome [4].

The account of the events leading to the catastrophe inspired me to write this article.

Comparison with game development

Video game programming is not safety-critical and doesn’t pose a threat to human life. However, it still raises a number of challenges regarding the reliability of our solutions. No player likes corrupted save data, broken gameplay features or a crash in the middle of a winning match.

In this article, I will present a list of common problems in game programming to help you recognize risky patterns and localize potential problems in your game code. Specifically, I will describe:

  • Initialization, update and deinitialization patterns
  • Bad input data
  • Too much control exposed in data
  • Dereferencing null pointer
  • Big classes
  • Division
  • Vector normalization

Also, I will discuss how we can improve our code and prevent common errors. I will be using examples written in C++.

Defensive programming and Design by Contract™

I will start with two special programming techniques. The first is “defensive programming” and the second is “design by contract.”

A program written using defensive programming should properly recover even in unexpected situations [5, Chapter 8][6]. It should not trust the input data, always validate arguments and provide default values in the case of broken input. Here is an example of a function written with defensive programming:

float GetSquareRoot(float Argument)
{
    if (Argument <= 0.0f)
    {
        return 0.0f;
    }
    return sqrt(Argument);
}

Even if the client provides wrong input, the function returns a default result of zero.

The second technique, design by contract, sets up a relationship between the code and its clients [7][8]. The code needs to specify preconditions, postconditions and invariants. Preconditions are obligations on the client: they must be true before the code starts. Postconditions are obligations on the code: it promises that they are true when it finishes. Invariants are conditions that must not change during execution of the code. Conditions can be expressed using:

  • Assertions [9][10][11]
  • Comments
  • Any other appropriate way (for example, a set of tests)

Typically, assertions are disabled during compilation of the final customer build. An example of the same function using design by contract:

// Argument needs to be equal to or greater than zero.
float GetSquareRoot(float Argument)
{
    assert(Argument >= 0.0f);
    return sqrt(Argument);
}

The contract is specified here using an assertion and a comment.
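To show all three kinds of conditions side by side, here is a minimal sketch. The HealthPool class and its members are hypothetical names invented for this illustration only; they do not come from any engine.

```cpp
#include <cassert>

// Hypothetical class used only to illustrate the three kinds of conditions.
class HealthPool
{
public:
    // Precondition: MaxHealth needs to be greater than zero.
    explicit HealthPool(float MaxHealth)
        : m_Health(MaxHealth)
        , m_MaxHealth(MaxHealth)
    {
        assert(MaxHealth > 0.0f); // Precondition.
        assert(IsValid());        // Invariant holds after construction.
    }

    // Precondition: Damage needs to be equal to or greater than zero.
    // Postcondition: health never increases.
    void ApplyDamage(float Damage)
    {
        assert(Damage >= 0.0f); // Precondition.
        const float HealthBefore = m_Health;

        m_Health -= Damage;
        if (m_Health < 0.0f)
            m_Health = 0.0f;

        assert(m_Health <= HealthBefore); // Postcondition.
        assert(IsValid());                // Invariant.
        (void)HealthBefore; // Avoids an unused-variable warning when asserts are disabled.
    }

    float GetHealth() const { return m_Health; }

private:
    // Invariant: 0 <= m_Health <= m_MaxHealth.
    bool IsValid() const
    {
        return m_Health >= 0.0f && m_Health <= m_MaxHealth;
    }

    float m_Health;
    float m_MaxHealth;
};
```

The precondition is the caller's responsibility, while the postcondition and the invariant are checked by the class itself before it returns control.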

Both techniques, defensive programming and design by contract, help us to write better software. However, I believe that game programming deserves special treatment due to its unique circumstances:

1. Maximum execution speed is essential for games.

Defensive programming cannot be used everywhere in the code, because it would result in bloated code if every possible input was validated, and this additional code would likely introduce its own issues. Our game would suffer severely in terms of execution speed. Also, too much defensive code can hide important issues if errors are bypassed silently.

2. Programming is done in a rapidly changing environment where programmers need to quickly take new ideas and run with them.

Game programming involves rapid development cycles where ideas need to be proven fast. But creating code full of unsafe solutions without any defensive approach is a recipe for serious problems in the future.

3. The code we write is also used as a product during development (not only after publishing).

When we work on a new gameplay feature, a new tool as part of an editor or improvements for an engine pipeline, our first users are other team members, designers and artists. These people also create the game and use our software as a tool. This means that our code lives in two realities while it is being worked on: the development world and the customer world, even before it is released to players. Using assertions and forcing the game (or editor) to exit when a condition is false might be good for a programmer, but it is the last thing a designer wants when creating new content for a game.

Additionally, your perception of the problem will depend on the specific needs of your game, team and type of work. Rendering programmers might favor speed at all costs, while AI programmers want more stable code with fail-safe solutions.

We will start our review of typical problems with one of the most common base designs for game programming classes.

Initialization, update and deinitialization patterns

Let us consider an example implementation of the RAII (Resource Acquisition Is Initialization) concept [12, Item 13]:

class MyClass
{
public:
    MyClass()
    {
        m_Data1 = new DataClass1;
        m_Data2 = new DataClass2;
    }
    ~MyClass()
    {
        delete m_Data1;
        delete m_Data2;
    }

    void Update(float FrameTime)
    {
        m_Data1->Update(FrameTime);
        m_Data2->Update(FrameTime);
    }

private:
    DataClass1* m_Data1;
    DataClass2* m_Data2;
};

We create two objects in the constructor and delete them in the destructor. Also, this class contains a very typical method called Update to perform certain operations on the data in every frame.

Now, let us remove memory management from the constructor and destructor and create specific methods to handle these data resources. This often happens when we want to avoid executing too many operations in one place. Usually, resource management is a costly operation, so we would like to have better control of when it happens in the game. We will create two new methods: Initialize and Deinitialize to handle memory management.

Here is the new version of the class:

class MyClass
{
public:
    MyClass() { }
    ~MyClass() { }

    void Initialize()
    {
        m_Data1 = new DataClass1;
        m_Data2 = new DataClass2;
    }

    void Deinitialize()
    {
        delete m_Data1;
        delete m_Data2;
    }

    void Update(float FrameTime)
    {
        m_Data1->Update(FrameTime);
        m_Data2->Update(FrameTime);
    }

private:
    DataClass1* m_Data1;
    DataClass2* m_Data2;
};

This code still looks very simple, but it has several big problems:

Problem #1. There is no proper initialization for our member pointers in the constructor.

Problem #2. What if the client calls the Initialize method twice in a row? Unfortunately, we have a memory leak here.

Problem #3. There is a similar problem with the Deinitialize method. It works well when the client properly calls Initialize first and then Deinitialize. However, this method breaks down when the client calls Deinitialize twice in a row [13] – this creates a double delete problem.

Problem #4. Another problem is our Update method. What happens if the client calls this method without calling Initialize first? A crash will occur when dereferencing the uninitialized or null pointer m_Data1.

Problem #5. Copying the object of this class creates a possible double delete problem.

Note that we are not specifying any design by contract here. We simply want to consider possible usage patterns by clients of this class to get an idea of common problems.

To modify this code and make it correct, stable and fail-safe, we need to:

  1. Add proper initialization for our member variables.
  2. Add a Boolean member variable m_Initialized to indicate that the object is properly initialized and use this variable in our methods.
  3. As a standard practice, we need to always clear the pointer after memory deallocation is executed [13].
  4. Call the Deinitialize method from the destructor to be sure that memory is deallocated even if the client does not call Deinitialize.
  5. Declare the copy constructor and copy assignment operator private to prevent copying.

We will split the correct code into two parts. Here is the code for the header file:

class MyClass
{
public:
    MyClass();
    ~MyClass();

    void Initialize();
    void Deinitialize();

    void Update(float FrameTime);

private:
    MyClass(const MyClass&);
    MyClass& operator=(const MyClass&);

    bool m_Initialized;
    DataClass1* m_Data1;
    DataClass2* m_Data2;
};

Here is the code for correct implementation:

MyClass::MyClass()
    : m_Initialized(false)
    , m_Data1(nullptr)
    , m_Data2(nullptr)
{
}

MyClass::~MyClass()
{
    // Prevent a memory leak if the client does not call Deinitialize.
    Deinitialize();
}

void MyClass::Initialize()
{
    if (m_Initialized)
        return;

    m_Data1 = new DataClass1;
    m_Data2 = new DataClass2;

    m_Initialized = true;
}

void MyClass::Deinitialize()
{
    if (!m_Initialized)
        return;

    delete m_Data1;
    m_Data1 = nullptr;

    delete m_Data2;
    m_Data2 = nullptr;

    m_Initialized = false;
}

void MyClass::Update(float FrameTime)
{
    if (!m_Initialized)
        return;

    // Defensive approach to check the pointers separately.
    if (!m_Data1 || !m_Data2)
        return;

    m_Data1->Update(FrameTime);
    m_Data2->Update(FrameTime);
}

Assumptions about the expected number and order of method calls by clients of the class are a very common source of issues. This is an especially challenging problem for bigger classes with a significant number of methods.
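As an alternative sketch, and assuming a C++11 compiler is available, the same five problems can also be addressed by holding the data in std::unique_ptr members. This swaps the manual delete calls and the m_Initialized flag for the smart pointer's own state; it is a different technique from the implementation above, not a replacement mandated by it. DataClass1 and DataClass2 are given placeholder bodies here, and IsInitialized is a hypothetical helper.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Placeholder data classes; their real definitions are not shown in the article.
struct DataClass1
{
    float m_TotalTime = 0.0f;
    void Update(float FrameTime) { m_TotalTime += FrameTime; }
};

struct DataClass2
{
    int m_FrameCount = 0;
    void Update(float /*FrameTime*/) { ++m_FrameCount; }
};

class MyClass
{
public:
    void Initialize()
    {
        // A second call in a row is ignored, so nothing leaks.
        if (IsInitialized())
            return;

        m_Data1.reset(new DataClass1);
        m_Data2.reset(new DataClass2);
        assert(IsInitialized()); // Postcondition.
    }

    void Deinitialize()
    {
        // reset() deletes the object (if any) and leaves a null pointer,
        // so calling Deinitialize twice is harmless.
        m_Data1.reset();
        m_Data2.reset();
    }

    void Update(float FrameTime)
    {
        if (!IsInitialized())
            return;

        m_Data1->Update(FrameTime);
        m_Data2->Update(FrameTime);
    }

    bool IsInitialized() const { return m_Data1 && m_Data2; }

private:
    // Both pointers start as null and are released automatically
    // when the object is destroyed.
    std::unique_ptr<DataClass1> m_Data1;
    std::unique_ptr<DataClass2> m_Data2;
};

// std::unique_ptr is move-only, so accidental copies do not compile.
static_assert(!std::is_copy_constructible<MyClass>::value,
              "MyClass is not copyable");
```

The trade-off is that the object's state is now implied by the pointers instead of being stored in an explicit flag, which may or may not suit a class with more complex initialization.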

We can find similar solutions in Unreal Engine™, for example in the class UActorComponent [14]. The RegisterComponentWithWorld and UnregisterComponent methods use the member variable bRegistered to indicate the state of the object. The UActorComponent class contains many more similar pairs of methods: BeginPlay and EndPlay, InitializeComponent and UninitializeComponent, etc.

Bad input data

Bad input data is a problem that has existed since the beginning of our industry [15]. Missing input files like meshes or unexpected values for parameters that cause a game or editor to crash are something that no developer likes.

In general, we as programmers should be especially sensitive to this problem. Our code needs to be written in such a way that we provide default values for missing input, like default textures or visual objects in place of a wrong file, as well as useful info to the user in the form of a warning message or log entry [16, Chapter 3.3.2.2, page 147]. No program should crash in such a case.
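A minimal sketch of this idea follows. The Texture and TextureCache types, the "DefaultCheckerboard" name and the log format are all assumptions made for the illustration; a real engine would use its own asset and logging systems.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical minimal texture type; real engines have their own.
struct Texture
{
    std::string m_Name;
    bool m_IsFallback;
};

// Hypothetical cache that never fails a lookup.
class TextureCache
{
public:
    void Add(const std::string& Name)
    {
        assert(!Name.empty()); // Precondition: textures need a name.
        m_Textures[Name] = Texture{Name, false};
    }

    // Returns the requested texture, or a clearly marked default texture
    // plus a log entry when the asset is missing.
    const Texture& FindOrDefault(const std::string& Name)
    {
        const auto It = m_Textures.find(Name);
        if (It != m_Textures.end())
            return It->second;

        std::fprintf(stderr, "Warning: texture '%s' not found, using default.\n",
                     Name.c_str());
        ++m_MissingCount;
        return m_Default;
    }

    int GetMissingCount() const { return m_MissingCount; }

private:
    std::map<std::string, Texture> m_Textures;
    Texture m_Default{"DefaultCheckerboard", true};
    int m_MissingCount = 0;
};
```

The caller always receives a usable object, and the warning tells the content creator which asset needs attention.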

In most game engines, it is very easy to expose any variable as a data parameter in the editor. Let us consider the parameter to control the number of objects created in one command:

int NumberOfProjectilesToSpawn;

Probably we indirectly assume that this variable will be 0, 1 or maybe 5. But what will happen if the user either mistakenly or intentionally sets an extremely high value, like 1,000,000, or a negative number? Will our code react correctly?

Good solutions to consider here are a sensible default value and limits on the range of accepted values. Alternatively, if we want to allow any value without limiting the maximum number, we can consider warning the user that the input value may cause the program to become unstable.

In the defensive programming–based approach, we make a clear distinction between risky input and safe input. We can consider all external input as always untrusted, but some internal data can be trusted if previously validated [5, Chapter 8.5, Figure 8-2]. As such, we could mark as always untrusted all external files, all data exposed in the editor, all player input, all data from the network, etc. The validation and correction process needs to convert such risky data into trusted data:

void SetNumberOfProjectilesToSpawn(int NumberOfProjectilesToSpawn)
{
    m_NumberOfProjectilesToSpawn = NumberOfProjectilesToSpawn;
    ValidateAndCorrectParameters();
}

void ValidateAndCorrectParameters()
{
    m_NumberOfProjectilesToSpawn = clamp(m_NumberOfProjectilesToSpawn, 0, 8);
}

From the perspective of game programming, it would be best to validate input data as early as possible. For example, immediately after entering parameters in the editor, before saving data or just after loading data for the object. Marking data as trusted as early as possible makes it easier to avoid unnecessary checks in the runtime code critical for game performance.
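The validation step can also tell the user that a value was corrected. Below is a self-contained sketch of that variant; ClampWithWarning and ProjectileSpawner are hypothetical names, the warning goes to stderr only as a stand-in for an editor log, and the limits 0 and 8 are simply reused from the example above.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: clamps Value to [MinValue, MaxValue] and
// logs a warning whenever the value had to be corrected.
int ClampWithWarning(const char* ParameterName, int Value, int MinValue, int MaxValue)
{
    assert(MinValue <= MaxValue); // Precondition on the limits themselves.

    int Corrected = Value;
    if (Corrected < MinValue)
        Corrected = MinValue;
    if (Corrected > MaxValue)
        Corrected = MaxValue;

    if (Corrected != Value)
    {
        std::fprintf(stderr, "Warning: %s = %d is outside [%d, %d]; using %d.\n",
                     ParameterName, Value, MinValue, MaxValue, Corrected);
    }
    return Corrected;
}

// Hypothetical owner of the parameter.
class ProjectileSpawner
{
public:
    // Untrusted editor input becomes trusted data at this single point.
    void SetNumberOfProjectilesToSpawn(int NumberOfProjectilesToSpawn)
    {
        m_NumberOfProjectilesToSpawn = ClampWithWarning(
            "NumberOfProjectilesToSpawn", NumberOfProjectilesToSpawn, 0, 8);
    }

    int GetNumberOfProjectilesToSpawn() const { return m_NumberOfProjectilesToSpawn; }

private:
    int m_NumberOfProjectilesToSpawn = 1; // Default value.
};
```

Because the member can only be written through the setter, runtime code that reads it does not need to repeat the range check.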

Too much control exposed in data

Taking the problem of bad input data a step further, we come across whole systems that are data-driven. Such systems can be configured and controlled without the programmer, entirely through the set of input data.

It is useful to expose the full system control or game feature to every team member, especially non-programmers. But programmers need to ask themselves: do we support every possible combination of values for the given set of parameters? Or will the game crash when something unexpected is provided as input?

Creating a system that is fully data-driven is very hard and time-consuming [16, Chapter 14.3, page 857].

When we expose another variable as a new parameter, it becomes part of a bigger set of data-driven values. Automatically, we add a new layer of complexity and dependency between the code and the data.

Let us consider a simple structure to control enemy behavior, for example:

struct EnemyBehavior
{
    float m_WalkSpeed;
    float m_TurningRate;
    bool m_JumpAllowed;
    float m_JumpSpeed;
    // ...
};

Does the code support a very low value for m_WalkSpeed and a very high value for m_TurningRate at the same time? Is jumping still executed correctly when m_WalkSpeed is very high?

A safer solution might be to prepare predefined sets of parameters in the code that we know are supported in implementation. The user is given a set of presets defined in the code and exposed in the editor:

enum EnemyBehaviorPreset
{
    EnemyBehaviorPreset_TightCorridors,
    EnemyBehaviorPreset_IndoorHalls,
    EnemyBehaviorPreset_OutdoorOpenSpace
};

We provide the user with only one variable that can be changed in the editor:

EnemyBehaviorPreset m_EnemyBehavior;

This way we limit the user’s freedom in defining the behavior, but our solution is more stable and manageable by programmers.
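One possible way to connect the preset to the actual parameters is a single function in code. The sketch below repeats the two declarations so that it is self-contained; MakeEnemyBehavior is a hypothetical name, and every numeric value is a placeholder chosen for the illustration, not a tuned setting.

```cpp
#include <cassert>

enum EnemyBehaviorPreset
{
    EnemyBehaviorPreset_TightCorridors,
    EnemyBehaviorPreset_IndoorHalls,
    EnemyBehaviorPreset_OutdoorOpenSpace
};

struct EnemyBehavior
{
    float m_WalkSpeed;
    float m_TurningRate;
    bool m_JumpAllowed;
    float m_JumpSpeed;
};

// Hypothetical factory: every combination returned here is one that
// the programmers have implemented and tested together.
EnemyBehavior MakeEnemyBehavior(EnemyBehaviorPreset Preset)
{
    switch (Preset)
    {
    case EnemyBehaviorPreset_TightCorridors:
        return EnemyBehavior{1.5f, 180.0f, false, 0.0f};
    case EnemyBehaviorPreset_IndoorHalls:
        return EnemyBehavior{3.0f, 120.0f, true, 4.0f};
    case EnemyBehaviorPreset_OutdoorOpenSpace:
        return EnemyBehavior{6.0f, 90.0f, true, 6.0f};
    }

    // Reached only if the enum value is corrupted or a new preset was
    // added without updating this function.
    assert(false && "Unknown EnemyBehaviorPreset");
    return EnemyBehavior{3.0f, 120.0f, true, 4.0f};
}
```

Adding a new preset then requires a code change, which is exactly the point: the supported combinations stay under the programmers' control.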

Dereferencing null pointer

Dereferencing a null pointer is one of the main causes of instability that makes the game or editor crash [17].

In Unreal Engine 2 and 3, one of the design goals for the scripting language UnrealScript was to prevent this type of problem and create “a pointerless environment” [18]. Accessing object references in UnrealScript was always safe, even if they did not refer to any object (making the operation have no effect).

Because execution speed is so essential for games, we cannot add a conditional instruction in C++ before every possible pointer dereference. However, the policy on how to work with pointers may differ depending on the department. Rendering programmers can sacrifice defensive mechanisms to gain speed at all costs (adding only asserts to detect cases with null pointers), while gameplay or AI programmers might favor a more defensive approach because of frequent changes in gameplay mechanics and code evolution. Therefore, it is important to determine your policy when working with pointers.

Avoiding pointers

First, we can avoid using pointers as function parameters and use references in C++ instead [17]. So, instead of writing the function as follows:

void MyFunction(ControlData* InputData)
{
    if (InputData)
    {
        int parameter1 = InputData->GetParameter1();
        // ...
    }
}

We can rewrite it using a reference:

void MyFunction(ControlData& InputData)
{
    int parameter1 = InputData.GetParameter1();
    // ...
}

This way, the caller of the function is responsible for providing the proper argument (and checking and dereferencing the pointer if needed).
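The caller's side of this arrangement might look like the following sketch. ControlData is given a placeholder body, and ReadParameter1 and ReadParameter1OrDefault are hypothetical functions written only for this illustration.

```cpp
#include <cassert>

// Placeholder definition; the real ControlData is not shown in the article.
struct ControlData
{
    int m_Parameter1;
    int GetParameter1() const { return m_Parameter1; }
};

// Callee: takes a reference, so it contains no null check at all.
int ReadParameter1(const ControlData& InputData)
{
    return InputData.GetParameter1();
}

// Caller: holds a pointer that may be null and checks it exactly once,
// right before dereferencing it to form the reference.
int ReadParameter1OrDefault(const ControlData* MaybeData, int DefaultValue)
{
    if (!MaybeData)
        return DefaultValue;

    return ReadParameter1(*MaybeData);
}
```

The null check has not disappeared; it has moved to the one place that actually knows whether the pointer can be null.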

Public and private methods

In practice, specifying the contract for a non-null pointer as the argument in the public method tends to be one of the most commonly broken contracts.

If a method is exposed as part of the public interface, I recommend checking its pointer arguments against null:

void MyClass::AnalyzeInputData(Data* Input)
{
    if (!Input)
        return;
    // ...
}

If a method is part of the private implementation, I may consider removing the early-out code. In such a case, a comment needs to state this clearly, or the method name itself should warn that its pointer arguments are not tested against null:

class MyClass
{
public:
    // Method always tests if the pointer argument is null.
    void AnalyzeInputData(Data* Input); 

private:
    void UpdateObject_NoTestForZeroPointer(ObjectClass* Object);

    // Method does not check validity of pointer arguments:
    void UpdateObject(ObjectClassA* Object1, ObjectClassB* Object2);
};

Additionally, at the beginning of private methods UpdateObject and UpdateObject_NoTestForZeroPointer, we can validate pointers using assertions:

// Method does not check validity of pointer arguments:
void MyClass::UpdateObject(ObjectClassA* Object1, ObjectClassB* Object2)
{
    assert(Object1);
    assert(Object2);
    // ...
}

However, this might not be the most practical approach if our assertion interrupts execution of the editor for other team members. What can greatly improve stability here is to create a “soft” assertion with a conditional defensive approach.

“Soft” assertions

Soft assertions should allow the code to proceed (for example, for other team members) while still reporting the issue to the programmer:

void MyClass::UpdateObject_NoTestForZeroPointer(ObjectClass* Object)
{
#ifdef CONTRACT_SAFE_CHECK
    reportIfNull(Object); // Soft assert
    if (!Object)
        return;
#endif
    // ...
}

This way we inform the programmer about the abnormal situation (by sending the info to the error database) without interrupting execution. Using the #ifdef block, we can control which build configurations include our defensive block. Development configurations should include the block, while profiling or final builds can exclude it.

Preconditions expressed in such a way should be considered if continuing to run the editor or game does not create other issues.
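The reportIfNull call is not defined in this article, so here is one way it could be sketched. Everything in the block is an assumption: the macro body, the ReportSoftAssert function, the global counter and the use of stderr in place of a real error database. CONTRACT_SAFE_CHECK is defined inline only to keep the example self-contained; normally the build configuration would define it.

```cpp
#include <cassert>
#include <cstdio>

// Normally supplied by the build configuration (assumption for this sketch).
#define CONTRACT_SAFE_CHECK

// Hypothetical reporting back end: counts failures and logs them.
// A real project might send the report to an error database instead.
static int g_SoftAssertCount = 0;

static void ReportSoftAssert(const char* Expression, const char* File, int Line)
{
    ++g_SoftAssertCount;
    std::fprintf(stderr, "Soft assert failed: %s (%s:%d)\n", Expression, File, Line);
}

#ifdef CONTRACT_SAFE_CHECK
    // Reports the problem but, unlike assert, never stops the program.
    #define reportIfNull(Pointer) \
        do { \
            if (!(Pointer)) \
                ReportSoftAssert(#Pointer " != nullptr", __FILE__, __LINE__); \
        } while (false)
#else
    #define reportIfNull(Pointer) do { } while (false)
#endif

// Placeholder object type for the illustration.
struct ObjectClass
{
    int m_UpdateCount = 0;
};

void UpdateObject_NoTestForZeroPointer(ObjectClass* Object)
{
#ifdef CONTRACT_SAFE_CHECK
    reportIfNull(Object); // Soft assert
    if (!Object)
        return;
#endif
    ++Object->m_UpdateCount;
}
```

With the check compiled in, a null argument produces one report and an early return; with it compiled out, the function goes back to trusting its caller.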

Big classes

Some classes tend to grow much faster than others, and some source files become much bigger than the rest. If you check the game you are currently working on, you will most likely find a couple of files that are significantly bigger than the others. Such classes are typically responsible for characters, players or general game managers. This is not a surprise: when we add new features to the game, even in small steps, we add them to already existing classes.

Such incremental growth becomes very problematic as a class nears a critical size. It becomes very hard to maintain preconditions, postconditions and invariants, and serious problems become much more likely. In big classes, it is easy to find the issues described in the “Initialization, update and deinitialization patterns” section. Most likely, they indicate that the class contains sub-elements that are not properly expressed as separate, encapsulated pieces of functionality.

There are several possible ways to solve this problem. I would like to present one of them here.

Whenever I need to add some new small element to an already existing class, I start by evaluating whether I can create a nested class within the parent class. Consider adding new member variables and methods to implement “Special Action 1” in the ObjectBehavior class:

class ObjectBehavior
{
public:
    // ...
private:
    // ...
    // Begin Special Action 1.
    bool m_ShouldStartSpecialAction1;
    float m_TimeLeftToStartSpecialAction1;
    float m_SpecialAction1Parameter;
    void InitializeSpecialAction1();
    void DeinitializeSpecialAction1();
    void UpdateSpecialAction1();
    // End Special Action 1.
    // ...
};

Implementation using the nested class:

class ObjectBehavior
{
public:
    // ...
private:
    // ...
    class SpecialAction1
    {
    public:
        void Initialize();
        void Deinitialize();
        void Update(ObjectBehavior* ParentObject);
    private:
        bool m_ShouldStart;
        float m_TimeLeftToStart;
        float m_Parameter;
    };

    SpecialAction1 m_SpecialAction1;
    // ...
};

What is better here? We nicely encapsulate new variables in a separate class in a private section. Only the new SpecialAction1 nested class has access to them. This will also make it much easier to separate these classes further in the future if needed.

Division

Video games use a lot of math operations on floating-point numbers. For example, we frequently perform division in our game code. This is a very basic math operation, yet still there is the edge case of division by zero. If we try to perform division by zero we end up with a floating-point exception or, if this exception is masked [19], the value for the resulting float variable will be “Infinity” (INF) or “Not a Number” (NaN).
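One defensive way to handle this edge case is to route risky divisions through a small helper. The sketch below assumes IEEE 754 floating-point behavior with exceptions masked; SafeDivide and its Fallback parameter are hypothetical names, not part of any standard library.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: returns Numerator / Denominator, or Fallback
// when the division would not produce a finite number.
float SafeDivide(float Numerator, float Denominator, float Fallback)
{
    assert(std::isfinite(Fallback)); // The fallback itself must be usable.

    // Division by zero is rejected before it happens.
    if (Denominator == 0.0f)
        return Fallback;

    const float Result = Numerator / Denominator;

    // Catches INF and NaN results, for example from overflow
    // or from arguments that were already not finite.
    return std::isfinite(Result) ? Result : Fallback;
}
```

A helper like this costs a branch per call, so it fits gameplay code better than inner rendering loops, where validating the data earlier may be the better choice.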

Please also consider the following example, where both arguments are valid non-zero floating-point numbers, but the result of division is infinity (because it falls outside the representable range of a 32-bit float):

fl
