Things that can muddle your replay feature

This blog post lists things that might cause problems when you implement a replay feature in your game. It also has some suggestions for avoiding those issues.

Kaarlo Raiha, Blogger

September 30, 2016

Introduction

Replay features (in this text, "replay" means that the player can watch a recent game event again after it has happened) are very useful for games. They can be used to drive retention, guide players, encourage social gameplay, fast-track bug reporting, help with performance testing, etc.

Implementing a replay feature in your game usually isn't a trivial task, since modern games have so many moving components that have to be tracked for replays. Because a replay feature might require work in multiple parts of the game engine, it is much easier to implement it while the game is in development than to patch it into the game after release.

As a rule of thumb, you should decide early on whether you want a replay feature in your game. If you choose to implement one, I would suggest testing it multiple times during the project to make sure it works as intended. A poorly working replay implementation can lead to e.g. game crashes or cheating accusations.

Types of replay

There are three different types of replays in computer/video games:

1. Record the input of the player or players while the game is running, and recreate the gameplay from that input when the player watches the recording.

e.g. player 1 pushes the Up button in frame 456, the Down button in frame 531 and the Attack button in frame 556.

2. Record the needed states of game objects at different frames and recreate the gameplay from those states (interpolating if needed).

e.g. in frame 32 the box has position x: 145 y: 32 and the player character has position x: 86 y: 54 with animation sprite #7 shown.

3. Capture video of the gameplay and play back that video file.

e.g. on an iOS device, use ReplayKit to record gameplay.

In this blog post I am talking about types 1. and 2., since video playback doesn't require recreating the gameplay in the game engine. Type 1. replays are the most error-prone to implement, but input-based replays also take the least storage space and are the best option for bug replication. Type 2. replays take more storage space than type 1. replays but are usually easier to implement.

Naturally you can combine replay types (e.g. in an AR game you can store video for the background and frame-specific player inputs from the controller to build your replays), but it is usually easier to stick with one replay type.
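To make the difference between type 1. and type 2. data more concrete, here is a minimal sketch of what one record of each type could look like, filled with the example values from the list above. The names (GameButton, InputEvent, ObjectState, ReplayExamples) and the object IDs are made up for this post and do not come from any particular engine.

```csharp
using System.Collections.Generic;

// Hypothetical button set for the example
public enum GameButton : byte { Up, Down, Left, Right, Attack }

// Type 1.: one small record per input change
public readonly struct InputEvent
{
    public readonly int Frame;
    public readonly byte Player;
    public readonly GameButton Button;

    public InputEvent(int frame, byte player, GameButton button)
    {
        Frame = frame; Player = player; Button = button;
    }
}

// Type 2.: one record per tracked object per stored frame
public readonly struct ObjectState
{
    public readonly int Frame;
    public readonly int ObjectId;
    public readonly int X, Y;
    public readonly int SpriteIndex;

    public ObjectState(int frame, int objectId, int x, int y, int spriteIndex)
    {
        Frame = frame; ObjectId = objectId; X = x; Y = y; SpriteIndex = spriteIndex;
    }
}

public static class ReplayExamples
{
    // "player 1 pushes Up in frame 456, Down in frame 531, Attack in frame 556"
    public static List<InputEvent> Type1Example() => new List<InputEvent>
    {
        new InputEvent(456, 1, GameButton.Up),
        new InputEvent(531, 1, GameButton.Down),
        new InputEvent(556, 1, GameButton.Attack),
    };

    // "in frame 32 the box is at 145,32 and the player character at 86,54 with sprite #7"
    // Object IDs 1 (box) and 2 (player character) are assumptions; the box has no
    // sprite in the example, so 0 is used as a placeholder.
    public static List<ObjectState> Type2Example() => new List<ObjectState>
    {
        new ObjectState(32, 1, 145, 32, 0),
        new ObjectState(32, 2, 86, 54, 7),
    };
}
```

Three input events cover a hundred frames of type 1. replay here, while a type 2. replay needs one record per object for every stored frame, which is where the difference in storage space comes from.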

Deterministic system

As the heading says, if you want to implement a proper type 1. replay feature for your game, you need a deterministic system. By deterministic system I mean that the same input should always produce the same output. Your game engine doesn't have to be completely deterministic to support a replay feature, but it has to support some kind of override mechanism for playing back the recorded replays.

In most games you don't need a pixel-accurate replay of the video output or a sample-accurate replay of the audio output, so you can cut some corners when you implement a replay feature. Nobody is going to notice if e.g. your particles drop a bit differently every time the replay is played. But if the replay sometimes ends in a different gameplay state (e.g. the red team wins the round during the replay while the blue team actually won the round during gameplay), then you are in trouble.

Naturally this requirement for a deterministic system also applies to type 2. replays if you don't store every state of every object on every frame (e.g. if you only store every Nth frame as a key frame and interpolate between them).

So if you have to calculate game states during replay playback, make sure those calculations are exactly the same as they were when that game event actually happened.
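One possible shape for such an override mechanism is to hide the input behind an interface, so that the simulation cannot tell whether the buttons come from a controller or from a replay file. The sketch below is only an illustration under that assumption; IInputSource, RecordedInput and Simulation are hypothetical names, and the "gameplay" is reduced to moving one integer.

```csharp
using System.Collections.Generic;

public interface IInputSource
{
    // Buttons pressed on the given simulation frame
    IReadOnlyList<string> ButtonsFor(int frame);
}

// Replay-side implementation: answers from recorded data instead of a controller
public sealed class RecordedInput : IInputSource
{
    private static readonly List<string> None = new List<string>();
    private readonly Dictionary<int, List<string>> byFrame = new Dictionary<int, List<string>>();

    public void Add(int frame, string button)
    {
        if (!byFrame.TryGetValue(frame, out var list))
            byFrame[frame] = list = new List<string>();
        list.Add(button);
    }

    public IReadOnlyList<string> ButtonsFor(int frame) =>
        byFrame.TryGetValue(frame, out var list) ? list : None;
}

public sealed class Simulation
{
    public int PlayerY { get; private set; }

    // The simulation only ever sees IInputSource, never the device itself
    public void Run(IInputSource input, int frames)
    {
        for (int frame = 0; frame < frames; frame++)
        {
            foreach (string button in input.ButtonsFor(frame))
            {
                if (button == "Up") PlayerY--;
                else if (button == "Down") PlayerY++;
            }
        }
    }
}
```

Running two fresh Simulation instances against the same RecordedInput has to end with the same PlayerY. If it doesn't, something other than the input is leaking into the game logic.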

Problems:

1. Random number generation (RNG)

If your game uses a random number generator, it can cause problems for your replay feature. You should always use a deterministic random bit generator (DRBG) when you need random numbers for your gameplay. When you use a DRBG, make sure you store the seed used to initialize it in the replay file, and that you query numbers from it in the same order. That way you can guarantee that the RNG provides the same numbers during the replay as it did during gameplay.

- Examples of RNG that will break your replay -

* During gameplay an archer shoots a swordsman with a bow and the RNG rolls 6 points of damage (the damage range is 4-8). Since the RNG state wasn't stored in the replay file, the replay uses the default or current state of the random number generator. That mistake leads to a situation where the replay RNG spits out 8 points of damage.

* During gameplay the archer's AI script is handled first, and it hits the swordsman with a critical hit (a roll of 1 from the range 1-100), so the swordsman gets stunned and cannot attack. During the replay the swordsman's AI script goes first (because the order is wrong), and this time it rolls the 1 and the archer gets stunned.

While some systems offer hardware-based random number generators, those are bad for replays since they are usually not deterministic. Also, if you have to code your own RNG, make sure it is a DRBG by NOT using things like CPU usage percentage, system thread count, uninitialized variables or process ID in number generation.

It is easier to detect and fix RNG-related problems if you create one single class for RNG-related functions and make sure that only it is used for generating random numbers. You should also create unit test cases for those RNG functions to verify their determinism.

You might also want to separate the RNG for gameplay logic from the RNG for visual effects. That way visual changes specific to the playback device (e.g. fewer particles on low quality settings) won't alter the replay playback.
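A minimal sketch of such a single RNG class could look like the following. It assumes a plain xorshift32 generator, which I picked only because it is short and uses nothing but integer operations; ReplayRng and GameRandom are hypothetical names, and any other seedable generator would fit the same shape.

```csharp
using System;

// Hypothetical helper class. xorshift32 uses only integer operations,
// so the same seed gives the same sequence on every platform.
public sealed class ReplayRng
{
    private uint state;

    // Store this value in the replay file when recording starts
    public uint Seed { get; }

    public ReplayRng(uint seed)
    {
        if (seed == 0) throw new ArgumentException("xorshift32 needs a non-zero seed", nameof(seed));
        Seed = seed;
        state = seed;
    }

    public uint NextUInt()
    {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        return state;
    }

    // Inclusive range, e.g. Range(4, 8) for the archer's damage roll.
    // The small modulo bias is ignored to keep the sketch short.
    public int Range(int min, int max)
    {
        if (max < min) throw new ArgumentException("max must be >= min", nameof(max));
        long span = (long)max - min + 1;
        return (int)(min + NextUInt() % span);
    }
}

// Separate streams: drawing extra particle randoms never shifts gameplay rolls
public sealed class GameRandom
{
    public ReplayRng Gameplay { get; }
    public ReplayRng Visuals { get; }

    public GameRandom(uint gameplaySeed, uint visualSeed)
    {
        Gameplay = new ReplayRng(gameplaySeed);
        Visuals = new ReplayRng(visualSeed);
    }
}
```

A unit test for determinism can then be as simple as creating two generators with the same seed and checking that a few thousand calls return identical values.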

2. Floating point math

Hardware-based floating point values and the functions that operate on them can easily break your replays. The first big problem is related to optimizations, since the same floating point operation can produce different results in an optimized binary vs. an unoptimized binary. This means that a replay from the release version of the game can play back differently in the debug version of the game.

The second big problem with hardware-based floating point math is portability. If you run your replay on different hardware (or build with a different compiler), you might get different results. This might become an issue if you have to verify client replays on a server, because the server could flag some of those replays as cheated if their states don't match.

Floating point differences between the original gameplay and the replay can be very hard to spot with the naked eye. This is because those differences are usually very small and/or they might even cancel each other out. So it is possible that some replays work correctly while others have minor differences between the original gameplay and the replay.

In some cases you might be able to design your game in such a way that it doesn't use floating point math for gameplay (e.g. all positions are integers). Or, if you need real numbers in your gameplay and you must be sure that they always behave in the same manner on every supported platform, you can choose to use fixed-point arithmetic.
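As an illustration of the fixed-point option, here is a minimal sketch of a 16.16 fixed-point type (16 integer bits, 16 fraction bits). The type name Fixed16 and the choice of 16 fraction bits are assumptions for this example; a real game would pick the split based on the value ranges it needs and would also have to handle overflow.

```csharp
public readonly struct Fixed16
{
    public const int FractionBits = 16;
    public const int One = 1 << FractionBits;

    // Raw 16.16 value; this integer is what goes into the replay file
    public readonly int Raw;

    private Fixed16(int raw) { Raw = raw; }

    public static Fixed16 FromRaw(int raw) => new Fixed16(raw);
    public static Fixed16 FromInt(int value) => new Fixed16(value * One);

    public static Fixed16 operator +(Fixed16 a, Fixed16 b) => new Fixed16(a.Raw + b.Raw);
    public static Fixed16 operator -(Fixed16 a, Fixed16 b) => new Fixed16(a.Raw - b.Raw);

    // Widen to 64 bits so the intermediate product cannot overflow
    public static Fixed16 operator *(Fixed16 a, Fixed16 b) =>
        new Fixed16((int)(((long)a.Raw * b.Raw) >> FractionBits));

    public static Fixed16 operator /(Fixed16 a, Fixed16 b) =>
        new Fixed16((int)(((long)a.Raw << FractionBits) / b.Raw));

    // For display only; never feed this back into gameplay logic
    public double ToDouble() => Raw / (double)One;
}
```

Because every operation here is plain integer arithmetic, the results do not depend on compiler optimizations or on the floating point unit of the playback device.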

3. Script/code execution order

This part doesn't apply to all games, since some game engines have a hard-coded code execution order (which guarantees that events always play out in the same manner). But if e.g. the processing order of your NPCs can differ between runs, you should make sure that during the replay the order is exactly the same as it was during gameplay.

- Examples of script execution order that will break your replay -

* During gameplay a swordsman gets hit by two arrows (archer #1 shot the first arrow, archer #2 shot the second arrow) and dies. Since archer #2 shot the latter one, he/she gets the kill score. During the replay the order of the archers is reversed, so this time archer #1 gets the kill score.

Keeping the right order becomes more difficult in situations where scripts are dynamically added to or removed from the execution engine during gameplay. Another thing that can bite you is multithreading when it is used to handle multiple scripts of the same type at the same time.
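One way to pin the order down, assuming every actor can be given an ID that is assigned the same way during gameplay and replay, is to sort by that ID before each update. The sketch below shows the idea; IReplayActor, ActorScheduler and Archer are hypothetical names, and the Archer class only logs the order in which it was ticked.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface IReplayActor
{
    // Must be assigned identically during gameplay and replay
    int StableId { get; }
    void Tick(int frame);
}

public static class ActorScheduler
{
    public static void TickAll(IEnumerable<IReplayActor> actors, int frame)
    {
        // Snapshot and sort, so container order (insertion order, hash order)
        // never decides who acts first. The snapshot also allows actors to be
        // added or removed while the frame is being processed.
        List<IReplayActor> ordered = actors.OrderBy(a => a.StableId).ToList();
        foreach (IReplayActor actor in ordered)
            actor.Tick(frame);
    }
}

// Example actor that only records the order it was ticked in
public sealed class Archer : IReplayActor
{
    private readonly List<int> log;

    public int StableId { get; }

    public Archer(int stableId, List<int> log)
    {
        StableId = stableId;
        this.log = log;
    }

    public void Tick(int frame) => log.Add(StableId);
}
```

With this in place archer #1 always acts before archer #2, no matter in which order the two were added to the collection.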

4. Replay serialization/deserialization

There are many things in serialization/deserialization that can cause problems for your replays, but here I have listed three that are somewhat common.

The first one is endianness, which isn't that big of a deal nowadays since most gaming platforms are little-endian. An endianness issue is usually very easy to spot, because e.g. the little-endian uint32 value 1 turns into 16777216 when read on a big-endian platform. If you have to support both byte orders, you should choose one of them and save all replays in that format (I would go with little-endian).

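The second one is serialization of floating point values. E.g. if you call the following code in C#

// Default formatting: on older .NET runtimes this drops digits
float fValue = 123.44251f;
Console.WriteLine(fValue.ToString());

you get 123.4425 and some accuracy is lost. The right way is to use round-trip formatting when storing floats as strings that have to keep their accuracy

// "R" asks for a string that parses back to exactly the same float
float fValue = 123.44251f;
Console.WriteLine(fValue.ToString("R"));

and this gives an output of 123.442513. (These outputs are from the .NET Framework of the time; .NET Core 3.0 and later already print the shortest round-trippable string by default.) This is also one of those issues that might be difficult to spot, since small differences in floating point values can still lead to an end result that seems to be correct.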

The third one is the order of serialization/deserialization. This means that if event A happened before event B during gameplay, the replay data structure should keep the same order of events, and replay playback should play the events in that order.
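For the first two points, one option is to skip text formatting entirely and write fixed-size binary values with an explicitly chosen byte order. The helper below is a minimal sketch of that idea; ReplayBytes and its method names are made up for this example, and it assumes the replay format is little-endian as suggested above.

```csharp
using System;

public static class ReplayBytes
{
    // Always writes the least significant byte first, whatever the CPU uses
    public static void WriteUInt32LE(byte[] buffer, int offset, uint value)
    {
        buffer[offset] = (byte)value;
        buffer[offset + 1] = (byte)(value >> 8);
        buffer[offset + 2] = (byte)(value >> 16);
        buffer[offset + 3] = (byte)(value >> 24);
    }

    public static uint ReadUInt32LE(byte[] buffer, int offset) =>
        (uint)(buffer[offset]
             | (buffer[offset + 1] << 8)
             | (buffer[offset + 2] << 16)
             | (buffer[offset + 3] << 24));

    // Stores the exact bit pattern of the float, so no digits can be lost
    public static void WriteSingleLE(byte[] buffer, int offset, float value)
    {
        byte[] bytes = BitConverter.GetBytes(value);
        if (!BitConverter.IsLittleEndian) Array.Reverse(bytes);
        Buffer.BlockCopy(bytes, 0, buffer, offset, 4);
    }

    public static float ReadSingleLE(byte[] buffer, int offset)
    {
        byte[] bytes = new byte[4];
        Buffer.BlockCopy(buffer, offset, bytes, 0, 4);
        if (!BitConverter.IsLittleEndian) Array.Reverse(bytes);
        return BitConverter.ToSingle(bytes, 0);
    }
}
```

Writing the value 1 with WriteUInt32LE always produces the bytes 01 00 00 00, and a float written with WriteSingleLE reads back bit for bit the same, so neither the endianness issue nor the formatting issue can appear.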

5. Physics engines and other third party plugins

"The first problem here is that the PhysX SDK is not deterministic. Especially when running different hardware setups, bus latencies can vary between runs, or on different machines. Even without hardware in the machine, we do not guarantee any type of determinism." PhysX Knowledge Base/FAQ

Since most physics engines aren't deterministic, they can cause big problems when you try to implement a replay feature. If you want to use a physics engine, in most cases you cannot get an accurate replication of the outcome with pure input-based replays (type 1.), which usually means that you have to go with state recording (type 2.) and partially or completely disable the physics engine while the replay is played.

In some situations partially disabling the physics engine can be a hard problem, since you could still need e.g. triggers and collision matrices but not gravity. This might lead to a situation where you have to process certain objects differently in the game engine while a replay is played back, and restore "normal" behavior to those objects in regular gameplay.

The same applies to all plugins (e.g. AI and pathfinding), since they might not be deterministic and might therefore also need to be partially or completely disabled.

With all the plugins that engines use, it is easy to miss the problems they might cause in certain setups (although in some cases it is obvious that a component does not behave as it should), because you can get exactly the same results on every run when you only test the replay feature on a single device.


(Figure: 50 cubes dropped from the same height one after another via script; 4 runs produced 4 different outcomes.)

6. Game updates (logic or values)

This is something that you should really think about when you design your replay system. If you alter your game logic or some constant values when you update your game, it can break the replays generated with an older game version.

Many game designers have chosen to break backwards compatibility of replays when the game is updated. Nicer developers inform players about this in advance, since some players might have replays they really like (and can hopefully convert the old replays to video format before updating).

If only the values of certain things are modified in a game update (e.g. the old archer had damage between 4-6 but after the patch the damage is 5-7), it is possible to store config values in the replay file when the replay is recorded. During replay playback the current default values are then overridden with the recorded values. But many games don't support this because there might be too many config values to store, or too much work to make backwards compatibility work.

It is always a good idea to store some sort of replay version number in the replay file. That way the game knows whether it can (or is allowed to) play back an older replay file.

- Examples of game update related changes that will break your replay -

* During gameplay archers shoot some arrows, and this gets stored in the replay file. In the next game update those archers are completely removed from the game, so the game engine cannot play back those old replays anymore.
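The version number and the recorded config values could live together in a small replay header. The following is a minimal sketch under that assumption; ReplayHeader, ReplayCompatibility, the version numbers and the config key used in the usage note are all made up for illustration.

```csharp
using System.Collections.Generic;

public sealed class ReplayHeader
{
    public int FormatVersion;   // layout of the replay file itself
    public int GameVersion;     // build whose game logic produced the replay

    // Only the tunable values that gameplay logic read while recording
    public Dictionary<string, int> ConfigOverrides = new Dictionary<string, int>();
}

public static class ReplayCompatibility
{
    // Hypothetical numbers for the example
    public const int CurrentFormatVersion = 3;
    public const int OldestPlayableGameVersion = 120;
    public const int CurrentGameVersion = 125;

    // Refuse replays whose file layout or game logic this build cannot reproduce
    public static bool CanPlay(ReplayHeader header) =>
        header.FormatVersion == CurrentFormatVersion &&
        header.GameVersion >= OldestPlayableGameVersion &&
        header.GameVersion <= CurrentGameVersion;

    // During playback a recorded value wins over the current default
    public static int GetConfig(ReplayHeader header, string key, int currentDefault) =>
        header.ConfigOverrides.TryGetValue(key, out int recorded) ? recorded : currentDefault;
}
```

For the archer example, a replay recorded before the patch would carry something like "archer.maxDamage" = 6, and GetConfig would return 6 during playback even though the current default is 7.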

7. Hacked replay files

If players can share your game's replays as separate files (and load those replay files in the game), you should take care that those replay files cannot be used to harm your users (or their devices). This basically means that you should validate every replay file before it is played.

Hackers can use modified replay files e.g. to crash the client or to execute malicious code. These kinds of security flaws can even get your game pulled from the stores, so you should really think about whether you want to allow players to share replays with each other.

If players can only download replays from your servers, you can also validate those replays when they are uploaded to your server.
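What validation means depends completely on your file format, but the general idea is to check every size, count and index before using it. The sketch below assumes a tiny made-up format (a 4-byte tag, an event count, then 6 bytes per input event); the tag value, the limits and the name ReplayValidator are all hypothetical.

```csharp
using System.IO;

public static class ReplayValidator
{
    private const uint Magic = 0x31595052;   // hypothetical file tag ("RPY1")
    private const int MaxEvents = 1000000;   // hypothetical upper limit
    private const int MaxPlayers = 8;
    private const int ButtonCount = 5;

    // Assumed layout: magic (uint32), count (int32),
    // then count * { frame (int32), player (byte), button (byte) }
    public static bool IsValid(byte[] file)
    {
        if (file == null) return false;
        try
        {
            using (var reader = new BinaryReader(new MemoryStream(file)))
            {
                if (reader.ReadUInt32() != Magic) return false;

                int count = reader.ReadInt32();
                if (count < 0 || count > MaxEvents) return false;

                // The file must be exactly as long as the count claims
                if (file.Length != 8 + (long)count * 6) return false;

                int previousFrame = 0;
                for (int i = 0; i < count; i++)
                {
                    int frame = reader.ReadInt32();
                    byte player = reader.ReadByte();
                    byte button = reader.ReadByte();

                    if (frame < previousFrame) return false;   // events must be in order
                    if (player >= MaxPlayers || button >= ButtonCount) return false;
                    previousFrame = frame;
                }
                return true;
            }
        }
        catch (EndOfStreamException)
        {
            return false;   // file ended before the header could be read
        }
    }
}
```

The same check can run on the server when a replay is uploaded, so that a broken file never reaches other players.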

Additional tips:

When you are implementing type 1. replay (input recording) support, you should store more data than just the input. E.g. in a battle game you should store the end states of units (hit points, positions etc.) and compare those values to the ones you get after the replay playback is done. That way you can easily see whether the replay playback produced the same end state as the gameplay did.
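If storing every end state feels like too much data, a checksum over the end states is a compact alternative. The following is a minimal sketch that uses the FNV-1a hash, which I chose only because it is short; UnitEndState and EndStateCheck are hypothetical names, and the caller is assumed to pass the units in the same order every time (e.g. sorted by unit ID).

```csharp
using System.Collections.Generic;

public struct UnitEndState
{
    public int UnitId, HitPoints, X, Y;
}

public static class EndStateCheck
{
    // FNV-1a over the integer fields, fed in a fixed order.
    // Store the result in the replay file and compare it after playback.
    public static uint Checksum(IEnumerable<UnitEndState> unitsSortedById)
    {
        uint hash = 2166136261;
        foreach (UnitEndState unit in unitsSortedById)
        {
            hash = Mix(hash, unit.UnitId);
            hash = Mix(hash, unit.HitPoints);
            hash = Mix(hash, unit.X);
            hash = Mix(hash, unit.Y);
        }
        return hash;
    }

    // Hashes the four bytes of the value, least significant byte first
    private static uint Mix(uint hash, int value)
    {
        uint bits = unchecked((uint)value);
        for (int shift = 0; shift < 32; shift += 8)
        {
            hash ^= (bits >> shift) & 0xFF;
            hash = unchecked(hash * 16777619u);
        }
        return hash;
    }
}
```

A mismatch only tells you that the replay diverged, not where, so during development it can help to also store the checksum at regular intervals (e.g. once per second of gameplay) to narrow down the frame where things went wrong.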

With type 2. replays (storing states of objects) it might not be possible to store the state of every object on every frame, since that could slow down the gameplay too much or the replay files might become too large. In that case you can e.g. record every fifth frame or only store values that have changed between frames.
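When only every fifth frame is stored, the frames in between have to be rebuilt during playback. A minimal sketch of that, assuming integer positions and plain linear interpolation between two stored key frames, could look like this (KeyframeMath and the interval of 5 are assumptions for the example):

```csharp
using System;

public static class KeyframeMath
{
    public const int KeyframeInterval = 5;   // "record every fifth frame"

    // Recording side: decides whether this frame is written to the replay
    public static bool ShouldStore(int frame) => frame % KeyframeInterval == 0;

    // Playback side: integer interpolation between two stored key frames.
    // Integer division keeps the result identical on every device.
    public static int Lerp(int frameA, int valueA, int frameB, int valueB, int frame)
    {
        if (frameB <= frameA) throw new ArgumentException("frameB must be after frameA");
        if (frame < frameA || frame > frameB) throw new ArgumentOutOfRangeException(nameof(frame));

        long span = (long)frameB - frameA;
        long offset = (long)frame - frameA;
        return (int)(valueA + ((long)valueB - valueA) * offset / span);
    }
}
```

For example, with x = 100 stored for frame 30 and x = 110 stored for frame 35, Lerp(30, 100, 35, 110, 32) gives 104 for frame 32. Linear interpolation works for smooth movement but not for things like teleports or sprite changes, which should snap to the stored value instead.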

It might be wise to set a time-based or file-size-based recording limit if your gameplay doesn't have strict time limits of its own. Otherwise replay files might become too big if the player does not stop recording, and they might then cause problems for users.

If you need a timeline function for your replay playback (meaning that you can jump to any point of the replay), you might have to unroll the replay file into memory. With type 1. replays this means that you might have to add a gameplay speed-up feature to your engine, because otherwise unrolling a replay might take too long to be useful.

If you give players a replay speed-up feature (e.g. 2x or 4x), you should decide what happens when the player's device does not have enough performance to play back replays at those speeds. Usually this means that either the game skips frames or the game limits the speed-up factor.

You can also add tools that let players turn type 1. or type 2. replays into video files (or image files + an audio file). E.g. the Source engine has a startmovie console command that can be used for this conversion. You can keep this export feature very simple, since most players who use a feature like this have some video editing experience. Players can do cuts, transitions, compression etc. in separate tools, and you shouldn't implement a full non-linear editing system in your game.

Select your recording time system (frames or timestamps) as early as possible. If you choose frames, you should also lock down the recording frame rate (e.g. 30 frames per second) as soon as possible, because these choices might have a huge impact on your gameplay. Some games run their in-game logic at only 30 FPS to keep replays usable, or only store replays at 30 FPS while the gameplay might run faster.
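Running the game logic at a fixed rate while rendering runs faster is commonly done with a time accumulator. The sketch below shows the idea under the assumption of 30 logic ticks per second and integer microseconds as the time unit; FixedStepClock is a hypothetical name and the rendering side is left out completely.

```csharp
public sealed class FixedStepClock
{
    public const int TicksPerSecond = 30;
    private const long StepMicros = 1000000 / TicksPerSecond;   // 33333 microseconds

    private long accumulatedMicros;

    // Logic frame counter; this is the frame number stored in the replay
    public int Frame { get; private set; }

    // Call once per rendered frame with the real time that has passed.
    // Returns how many logic ticks the game should run now.
    public int Advance(long elapsedMicros)
    {
        accumulatedMicros += elapsedMicros;
        int steps = 0;
        while (accumulatedMicros >= StepMicros)
        {
            accumulatedMicros -= StepMicros;
            Frame++;
            steps++;
        }
        return steps;
    }
}
```

Because inputs are stamped with the logic frame number and not with wall-clock time, a device that renders at 60 or 144 FPS still produces a replay that plays back the same way on a device that renders at 30 FPS.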
