Designing Secure, Flexible, and High Performance Game Network Architectures

Jenkins takes a look at possibilities for game-level network architecture for your titles, claiming the right implementation "can abstract away many common tasks, saving time, improving flexibility, and addressing cheating issues that can't be solved easily elsewhere."

Kevin Jenkins, Blogger

December 6, 2004

19 Min Read

Introduction

Multiplayer support in console games has traditionally been a secondary feature, appealing to the hardcore market and generating relatively few extra sales. This has been changing as online gaming goes mainstream, especially since consoles began supporting broadband.

Many developers find themselves required to add multiplayer for the first time, and are in the position of developing multiplayer systems from scratch. The straightforward solution is to implement a reliable packet system and use this to encode and transmit game-level messages. This results in network messages being hardcoded at a low level of abstraction, requires more development time over the life of the project, and may limit your game's feature set. A game-level network architecture can abstract away many common tasks, saving time, improving flexibility, and addressing cheating issues that can't be solved easily elsewhere. While architecture is game specific, this article will present high level ideas that can be used in whole or in part when designing your architecture, along with optimization ideas that can benefit existing systems.

Choosing Network Topology

Any networking architecture deals with three major issues: security, flexibility, and performance. The first thing to consider is what topology to use, as this is probably the single most important issue to resolve. Traditionally, games take one of two approaches: peer-to-peer or client-server. In a peer-to-peer system data is stored on the client and authorized by a consensus system whereas with client-server the server is trusted and stores most to all game data. Peer-to-peer avoids putting network load on a single system. This can be advantageous for data that doesn't need to be routed through a server, such as directed chat messages. Another situation where peer-to-peer becomes beneficial is when you don't have a single system that would be a suitable as a server - such as when all systems are consoles, and no one system has enough bandwidth to handle all combined traffic. Peer to peer communications can provide better performance but a client-server system is more secure and flexible.

An architecture commonly used is a client-server object replication system. In this system, specific objects and data members are transmitted on creation or changes. This can happen explicitly, with a call to serialize and replicate, or implicitly, where data is polled every update cycle and compared to the value it had the last update cycle. This is a good approach. It is straightforward and easy to understand. It fits into existing single player systems. With implementations that replicate objects automatically, scripts can modify data without knowledge of the underlying system and still achieve network synchronization. However, it is less efficient than a hand-tweaked system, sending data independent of context and often redundantly. It is also less secure, creating a strong coupling between game-code and network activity. As game-code changes are made, especially by programmers that aren't familiar with network security, vulnerabilities arise. We're going to expand on this to address these problems efficiently.

The architecture suggestions described herein are intended to provide the following qualities:

1. Cheat-resistant. When applicable, data and some game-code is maintained server side, optionally supplemented with peer-to-peer communications to avoid unnecessary server traffic.
2. Flexible and extensible. Intentional abstraction achieves loose coupling of game code with network code.
3. Low bandwidth utilization. Low level context related optimizations are exposed to higher level routines so scripters can define algorithms and rules by which to send data.
4. High performance. Optimizations and features described should be fast, require low memory utilization, and be optional improvements to existing proven algorithms.

Security and cheat-resistance

Security is important for any game, but especially if you plan to publish for the mass market. The more popular your game is, the more people will try to hack it. As soon as one hack is found it will spread through the ranks of would-be cheaters like wildfire with negative press likely to follow. This article doesn't cover cheating in its entirety, which deserves a whole book of its own (Applied Cryptology by Bruce Schneier is excellent), but we can stop some cheats by making wise architectural decisions.

Some of the most common methods used to cheat:

Method	System Level	Example Game	Description
Bug/Design Exploits	Game	Ultima Online	Attacking other players with firefields so as to not gain notoriety.
Network Attacks	Game-level network system	Counter-Strike	Intentional latency to cause client authoritative damage.
Modified binaries	Binary protection, Game-level network system	America's Army	Modifying memory to see through teargas and bushes
Modified packets	Low level network system	Quake	Modifying direction parameter in a packet - aimbot
Falsified connections	Low level network system	Numerous games	DoS by filling server with client bots

Cheating can be addressed through logging, robust implementation, validation, and architecture. Logging involves recording the last n packets, chat messages, connections and disconnections, invalid packet data, unusual game behavior, and per-client statistical data. Your low level networking system should include packet tampering detection, encryption, and security against falsified connections. Executable and system level protection involves data encryption, obfuscation, and validation. Game network-system issues will be covered throughout this article, beginning with network topologies.

Sensitive data on the server:

The traditional game topology is client-server, with most to all sensitive data kept server-side. This works well in most cases. I'm going to go one step further, and also suggest that game algorithms insofar as practical be kept server-side as well. In this situation, the client doesn't know game related algorithms in advance; it only knows how to process generic commands.

As an example, a traditional implementation of capture the flag has capture the flag implementation code on both the client and the server. The client is usually trusted to handle mundane events such as playing a sequence of animations and position the camera when the 'flag capture' message arrives. In my proposed implementation, no capture the flag code would exist on the client. In its place would be an extended command interface used for all game modes, enabling a high level of control from the server over the client's interface and in-game actions. All algorithms related to capture the flag exist only on the server.

The benefit of the second system is twofold. First, any changes to a game mode or new game modes require only a server-side patch. Second, potential cheaters have no foundation to work from, at least for CTF, because there is no CTF related code on the client. Any hacks would either have to deal with generics, or be discovered and implemented in real-time. The downside of this approach is that it requires more bandwidth - however this can be minimized through techniques discussed later in this article.

Abstract programming is much more difficult to design and implement than concrete programming. In practice what you implement will depend largely on the existing system, advance requirements knowledge, time, and reuse goals. The intent one should strive for is to keep game-specific data out of the client. For example, the game code, on both the server and client, should never assume the type of an object sent over the network as an ID. Data must be validated upon reception, by verifying ID types, and bounding numeric and other data.

Often you don't want one client to know the IP address of another client. Allowing players to know one another's IP's can lead to denial of service attacks and spoofed connections. However, in certain contexts, such as setting up voice communication or with small games, it can be reasonably safe for one client to know the IP of another client. In that case it's worth considering supporting a peer-to-peer system to transmit non-sensitive data directly, rather than through the server. Even with client-server architecture, in some cases it would still be necessary to perform server migration, for example in a 4 player game one player who was acting as the server might disconnect, forcing a different player to assume the role of server.

Sensitive data on the client:

When data is on the client, such as with a peer to peer topology, one method to detect cheating is a trust metric based on certain assumptions:

1. Cheaters are likely to cheat right away, if for no other reason than to check if the cheat works.
2. Cheaters are likely to perform the same cheat many times in a row. A speed hack will speed up your character every frame, rather than one frame.
3. A cheater's data is likely to be widely divergent from the data on the other systems - For example, gain 100 HP instead of gain 1 HP.
4. It's likely that only one or two systems will be doing the same cheat at the same time.

The process is straightforward. A new peer/client is assigned an initial trust rating. This value is continuously increased over time. Every time there is suspicious data from that client, subtract from the trust rating. For a client-server based system, you are done. If x <= 0, or if x drops too quickly, kick that client. With a peer to peer system, if x <= 0, or if x drops too quickly, then compare this value to that of other systems. If a majority of systems all report similar changes to the trust rating, kick the offending system out of the group. Otherwise, set the trust rating back to some average of the systems. These values are all contextual, and I have no specific numbers to recommend. A good rule of thumb is to make unintentional latency cause decreases in the trust rating that are marginally smaller than what is actually considered a cheat.

Efficient Network Messaging

With cheating behind us, it's time to address messaging. Broadly speaking, messaging is used for:

1. Object replication
2. Remote Method Invocation (RMI)
3. Data synchronization

Object replication involves sending a message when an object is created or destroyed by an authoritative system so that this same object is created or destroyed on other systems. Data members of that object may be synchronized as well and inherit the update rules of the object. Remote Method Invocation involves parsing commands and data out of messages, ensuring correct encoding, and mapping the command to a function on the remote machine with the data in the message passed as arguments to that function. Data synchronization involves determining which memory address to update given a system independent identifier, ensuring that this memory address is valid to write to in the given context, and changing the value. As this article is about architecture, I won't go into specific implementations but will instead explore improvements to these operations.

Relevance Based Messaging

Relevance based messaging is the practice of sending only relevant state-change data to relevant systems in order to reduce network traffic. In this case I am referring to object creation/destruction and updates to variable values. Update culling can be performed in both these cases. The most common culling method for objects is distance based, where objects will be created at range X, and destroyed at range X+Y from the target system's field of interest. The most common culling method for variables is to not resend if the data is identical. Obviously, if an object is culled so are its member variables.

The reason more advanced culling methods are not usually used is that they are contextual and we don't know much about the context at the network system level. For example, it may not be necessary to send most position updates if another player is on the opposite side of a impenetrable wall, even if they are nearby. The solution I propose is simple: the common solutions can remain as default behaviors, and expose the culling method or method properties to the higher level in the form of callbacks, overridable functions, a parameter list, or some other method. Architecturally, this means that each system which broadcasts updates needs to know the prior value per-system for each synchronized object or variable. If memory is a concern, we can limit saving prior values to only variables that change frequently.

Encoding Network Commands, Text, And Data

1. Encoding network commands
Games often send the same command frequently. A common operation is to use the first byte of a packet to represent a command identifier. This works but we can shave off a few bits by creating a frequency table of network commands which indicate likely or previously recorded session averages. For example:

Per system:
Movement: Called 84 times
Press trigger: Called 22 times
New player joined: Called 5 times.

We can use this frequency table as input for arithmetic or Huffman encoding, which will then give us an optimal bit-wise representation of each command. Table generation can be performed beforehand so as to not slow down the game at runtime. Clients and servers have different output sets and frequencies so this will need to be done separately for each system.

2. Encoding text commands
We may also apply this technique to commonly sent strings. On connection, a server can send a string table to the client of all strings that are mapped to a unique identifier. More complex implementations may encode special characters in the string for printf style format specifiers. From that point on, one bit can be sent, indicating a string lookup (or not), followed by a string identifier. If desired, this can also be based off some optimal encoding method as was done with commands. Alternatively, the first time a string is sent the server can send the string identifier followed by the string. From that point on the client is expected to use the string identifier. This can work unidirectional, or you can ack string identifiers and use them both ways.

3. Encoding data
Memory and processing permitting, we may also wish to perform global compression. The technique is the same. Generate a frequency table based on previously recorded server output and client output sent from the parsing system. Adaptive run-time compression is possible for data sent reliably and ordered.

A more useful and practical approach than global compression is to write data to bit streams that perform simple run-time contextual compression. The approach used by the network library RakNet is to provide WriteCompressed and ReadCompressed calls that recursively write a <1> if the high half of the data is all 0's Otherwise it writes a 0, then the rest of the data. The approach used by the Torque network engine is to specify the min and max values of a data element by which the minimum number of bits will be used to write that data. The first technique requires less maintenance and is more flexible, while the second technique provides better compression. If you decide it is worthwhile to use the second technique, this can be extended by a further parameter: granularity. Sometimes we don't care whether a value is 503.135224 or 503. By specifying the bits of granularity, we can perform this reduction automatically.

In either case, the option to specify compression parameters should be exposed to the higher level so that scripters or implementers can take advantage of it. In the case of specifying bits, it would be a very good idea to use a utility class to specify parameters and share this class between systems to reduce maintenance and the chance of mistakes.

Packet Content

One way to reduce the number of client network commands as well as reduce the likelihood of cheating is to have the client send inputs, rather than the results of inputs. This is straightforward and can make programming easier in some ways.

Send:
" Movement commands
" Use key command
" Jump key command
" Switch to weapon #5 command.

Don't send:
" Pick up red team's flag
" Capture flag
" Kill player with ID 381
" Switch to weapon 'Sniper Rifle'

The difference is that the set of actions has a one-to-many relationship with the set of input commands. Once you implement the input commands you gain the actions for free, and actions are a moving target while commands tend to stay fixed. This can reduce required maintenance and code-size. From the context of cheating, it limits the amount of code you have to deal with and in the worst-case scenario the client won't be able to do anything they couldn't have done within the normal context of the game. In practice it's not possible to achieve this level of abstraction and still have good game play because entropy will cause clients to get out of synch.

We can address entropy by including relevant data along with client commands. An example would be including position and orientation along with a movement or shoot command. This layer should be exposed to scripters or gameplay programmers because the information to determine what to send is only available at that level. This adds some complexity because we are sending client specific data, which is not trusted, along with commands, which are trusted. This client originated data will have to be parsed at the destination and compared against previous and likely values.

The practice of sending relevant data along with commands is also necessary for the server, for the same reasons. The difference is that clients can trust server data, so aside from value interpolation no additional processing needs to be done. As before, the interface for determining which variables are relevant should be exposed to scripters and gameplay programmers.

Network data summary

Throughout this article I have referred to applying properties to commands, variables, objects, systems, and strings. It may be clearer in certain implementations to aggregate these properties into a single system, whereby you can define all these properties up-front. This table summaries the properties and extra data I've covered:

Per command:

A unique identifier that can be used to find the command in a look-up table.
A bitwise encoding representing this command.
A list of variables and objects that need to be synchronized when this command is used.

Per variable:

A unique network transmittable identifier that is two way mappable. From the variable we should be able to get the identifier, and from the identifier we should be able to get the variable, given the context of the enclosing struct or class, if any.
What properties or method to use to interpolate this variable, if it is interpolated at all.
If this variable is written with compression, and if so the number of bits, min, and max values
What culling function to call or properties to use, if any, to determine if a variable has changed enough to be worth sending.
What object this variable is a member of, if any. Used to determine if a send is not necessary because the object itself is culled.
Which system(s) are allowed to update this variable.

Per object:

A unique network transmittable identifier that is two way mappable. From the object we should be able to get the identifier and from the identifier we should be able to get the object.
What culling algorithm to call or properties to use, if any, to determine if an object is in the area of interest (to send object creation and updates). Also what algorithm to process, if any, to determine if the object is in the area of disinterest (to send object destruction).
The list of synchronized member variables.
Which system(s) are allowed to create and destroy this class.

Per remote system, on a server or peer:

Which objects this system has instantiated
The field of interest for this system (usually the camera position)
The last values for the variables that were sent to this system.

Per encoded string:

A unique identifier by which programmers or scripters can refer to this string.
A bitwise encoding of this identifier that can be transmitted over the network
Optionally, what fields each string expects to take, if format specification is used.

Synchronizing non-critical data

Some data isn't critical but would be beneficial to have synchronized among systems. This includes graphical effects such as randomized particles or playing the same animation key frame at the same time. While it is possible to synchronize this data with messages, it is not necessary because they don't affect gameplay. A better and somewhat well-known approach is to synchronize one or more integers. The integers can change value over time according to some predetermined, low bandwidth method. For example, sending the current and next value, where the current value will change to the next value at a predetermined time. These integers can be used as key frame indices, random number generator seeds, and other things.

Low level network system requirements

It's important to point out that I didn't cover the low level network system that this system relies on. At a minimum the low level network layer should support, in order of most important to least:

Required features:

Reliable and unreliable messaging
Ordering and sequencing with multiple channel support
Secure connections with strong packet encryption and tampering detection
A wide range of statistical reports
Automatic message to packet aggregation and splitting
Remote method invocation
Automatic flow control
Message priority level support
Native internet simulation capabilities (packetloss, spikes, lag, out of order packets)
Path MTU discovery with the ability to set this explicitly
Native bitstream support

Optional but helpful features:

Multithreading support
Packet level compression
IO completion ports (for Windows based games)
Cross platform

Conclusion

This article is intended to provide architectural level ideas that can be used to make your game-level network system programming easier and less prone to cheats. It doesn't provide specific implementations, which are game specific. For further information, I recommend searching the websites Gamasutra, Gamedev.net, Google, Flipcode. Excellent mailing lists include the GameProgrammer.com list, and GD-Algorithms-list. Low level network libraries that have most to all of the capabilities specified here include my own creation RakNet (free) and the Torque game engine (low cost).

______________________________________________________

About the Author(s)

Kevin Jenkins

Blogger

Kevin Jenkins currently works as Senior Technology Programmer at The Collective, but was formerly a network programmer for games including Kuma\War, Deadly Dozen 2, and Elite Warriors Vietnam. He is the author of RakNet, a popular open-source game networking API used in three commercially published games, and can be contacted via [email protected].

See more from Kevin Jenkins

Related Topics

Related Topics

Recent in More

Related Topics

Designing Secure, Flexible, and High Performance Game Network Architectures

About the Author(s)

Latest News

Trending

Cooking Games Spotlight: Deep Dives, Interviews, and More

Featured Blogs

Game Developer Essentials

Related Topics

Related Topics

Recent in More

Related Topics

<span class="ArticleBase-LargeTitle">Designing Secure, Flexible, and High Performance Game Network Architectures</span>Designing Secure, Flexible, and High Performance Game Network Architectures

About the Author(s)

Latest News

Trending

Cooking Games Spotlight: Deep Dives, Interviews, and More

Featured Blogs

Game Developer Essentials

Designing Secure, Flexible, and High Performance Game Network Architectures