Despite the simple appearance, there’s a lot of work that went into the creation of KungFu Kickball: from machine-learned AI to arcade cabinet integration. The biggest challenge, though, was definitely the netcode. It went through many iterations before I landed on the current setup. So I figured it might be an interesting thing to share. I’m writing this article partially just for myself, to document my journey with the netcode. But I also hope it helps some folks shorten their own multiplayer adventure.
KungFu Kickball was originally conceived as a couch co-op game. It’s a fighting-sports game heavily influenced by other games like Smash Bros, Rocket League, and BaraBariBall. I wanted to make something that was fun to show at conventions and would also work well in a college dorm room setting. However, early on I realized the game needed to be able to be played online for it to have any chance of commercial success. I knew it would be a major undertaking, but I thought since I was starting from the ground up and my game had relatively few moving pieces, it wouldn’t be too bad. I was wrong.
To give a little background on myself, I started this project knowing very little about netcode. I had been a web developer for many years, so I knew the basics about how the internet works, but not much about how games generally use it. What I knew is I wanted it to be good enough that players wouldn’t complain about the netcode (a high bar). I also wanted it to be fair (wouldn’t give one player a lag advantage) and I wanted it to be able to work between platforms. Crossplay is pretty expected these days. So given that I was working in Unity, the first thing I did was look for a popular plugin (Unity had deprecated their own multiplayer library, UNet). That’s how my journey began with Photon PUN.
1. The Popular Pick
Photon is probably the most well known multiplayer plugin provider on the Unity asset store. And for a good reason. They’ve been in business for a long time and the pricing on most of their products is pretty reasonable. PUN is presented as the most user friendly offering and actually replicates a lot of the old Unity UNet functionality. However, after just playing with the demo, I realized it wasn’t going to work for me.
The primary issue stems from PUN’s handling of player authority. By default each player (client) has authority over their own character. This means that when figuring out where all the characters are in a scene, each client defers to that character’s owner. This has the advantage of each player feeling like their own character is very responsive. But things get tricky when two characters are running into each other. Now each client could believe that their character should be in the same spot, and you have the physics engine battling the netcode synchronization which ends up looking like a mess. Since my game needed player characters to be kicking and bouncing off each other in understandable ways, this setup just wasn’t going to work for me.
PUN does allow modification of this default behavior. So one possible solution to the “player contact” issue would be to make the “host” the authority on all characters. The issue with that is it would lead to unfair lag times, which is one of the big things I wanted to avoid. Basically the host would have no lag, and everyone connected to them would have lag equal to a full round trip to the host (twice their one-way latency). The non-host players would need to send their inputs to the host, and then the host would report back how their character moved. So that’s a full back and forth trip before the player sees their character move. And since I wanted to give my game a chance at having a real competitive scene, I needed something more fair. So back to the drawing board.
2. Lock, Stock and Step
Researching how most fighting games handle netcode, I learned about something called Lockstep. This seemed to be a pretty well respected architecture for peer-to-peer multiplayer games. And I wanted to stay peer-to-peer in order to keep my server costs low.
So what is lockstep? It’s often referred to as “deterministic lockstep,” and it’s frequently combined with rollback as “lockstep and rollback.” It’s actually one of the oldest forms of netcode architecture, and it’s what the original Doom used for LAN multiplayer. The basic idea behind lockstep is that rather than game state info, the only data clients send to each other are their own controller inputs. Then, assuming the game is completely deterministic, it should play out exactly the same for everyone. And lag should be exactly the same for everyone, since no player is the “authority” on the state of any game objects.
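To make the idea concrete, here is a minimal C# sketch of a lockstep loop. It is not the game’s actual code: the names LockstepSim, SubmitInput and TryAdvance are hypothetical, and the whole “game state” is a single integer so that the determinism is easy to see.

```csharp
using System.Collections.Generic;

// Hypothetical sketch: every peer runs the same deterministic simulation
// and only exchanges per-frame inputs, never game state.
public sealed class LockstepSim
{
    private readonly int playerCount;
    // inputs[frame][player] = that player's input byte for the frame (null = not received yet)
    private readonly Dictionary<int, byte?[]> inputs = new Dictionary<int, byte?[]>();

    public int Frame { get; private set; }   // next frame to simulate
    public long State { get; private set; }  // stand-in for the whole game state

    public LockstepSim(int playerCount) { this.playerCount = playerCount; }

    // Called for the local input and for every input received from a peer.
    public void SubmitInput(int frame, int player, byte input)
    {
        if (!inputs.TryGetValue(frame, out var slots))
        {
            slots = new byte?[playerCount];
            inputs[frame] = slots;
        }
        slots[player] = input;
    }

    // Advances one frame only if every player's input for it has arrived.
    public bool TryAdvance()
    {
        if (!inputs.TryGetValue(Frame, out var slots)) return false;
        foreach (var slot in slots)
            if (!slot.HasValue) return false;   // still waiting on someone

        // Deterministic update: integer math only, players applied in a fixed order.
        for (int p = 0; p < playerCount; p++)
            State = State * 31 + slots[p].Value + p;

        inputs.Remove(Frame);
        Frame++;
        return true;
    }
}
```

Two instances that are fed the same inputs, in any arrival order, end up with the same State. That property is the whole contract lockstep depends on.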
For a much more detailed explanation of lockstep, I highly recommend the “Gaffer on Games” blog. He has many great articles on every aspect of programming netcode, but his article on the lockstep method was really helpful for me to understand it.
Now that I knew the framework I wanted, I started looking for pre-built solutions, since I didn’t want to reinvent the wheel. The most well known (and respected) lockstep framework is something called GGPO (https://www.ggpo.net/). It’s now open source, but back when I started this journey it wasn’t and I didn’t have the funds to pay for any sort of licensing fee. So I found another lockstep solution on the Unity Asset Store with a reasonably priced payment plan. And after starting to implement it, things were working well! It still needed some adjusting, but character collisions felt good and the controls felt snappy. However, after getting about half way through implementing it, the developer dropped all support for the plugin. Instead they introduced a new lockstep-style plugin with a high recurring monthly fee that was beyond my budget.
At that point I felt like I had a pretty good grasp of lockstep and how I wanted a solution to work. So I figured, why not give it a shot. Time to reinvent the wheel.
3. Rolling My Own
Although I had to start again from scratch, I had learned a bunch just implementing the previous lockstep library. I wanted to keep a similar interface, I just needed to replace everything the library did. While doing research I also found some great GDC talks that went over how some other game companies dealt with these issues.
This talk about lockstep implemented in Mortal Kombat and Injustice 2 also just has a really great overview of lockstep concepts:
This talk about Rocket League netcode also goes into why physics stuff over the network is such a pain: https://www.youtube.com/watch?v=ueEmiDM94IE&t=1413s
The rest of this article is going to be getting into the technical weeds of how I implemented my own lockstep solution.
To start off with, the biggest challenge with lockstep is making sure the game is completely deterministic. That means that given the same conditions (the same start state with the same controller inputs), the outcome will always be the same. And as I found out, the very basic variable type ‘float’, which is used all over the place in most game engines, isn’t deterministic across different computer chips. Or at least math involving floats isn’t deterministic. Different processors, compilers, and math libraries can produce slightly different results for the same float calculation, which basically has a butterfly effect when it comes to lockstep, and games can get out of sync. The dreaded ‘desync.’ This is one of the downsides of lockstep. Once a game is desynced, it’s next to impossible to get it to sync up again.
To combat this, many games use a number type called fixed point. Fixed point is a way of storing real numbers as ‘int’s. It basically uses the first set of bits to represent the digits before the decimal point, and the second set of bits to represent the digits after the decimal point. Fixed-point numbers are more limited in range than floats, but the advantage is that all calculations with them are just int calculations under the hood, which are deterministic across hardware. And luckily this was one wheel I didn’t need to reinvent. I found a great open source C# implementation of a fixed point type on GitHub: https://github.com/asik/FixedMath.Net
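As a rough illustration of how fixed point works, here is a minimal 32-bit sketch with 16 integer bits and 16 fractional bits. The type name Fixed32 and its members are hypothetical and do not reproduce the API of FixedMath.Net; a real library also handles overflow, rounding, and functions like square root and sine.

```csharp
// Hypothetical Q16.16 fixed-point type: 16 integer bits, 16 fractional bits.
public readonly struct Fixed32
{
    private const int FractionBits = 16;
    public readonly int Raw;   // the real value multiplied by 2^16

    private Fixed32(int raw) { Raw = raw; }

    public static Fixed32 FromInt(int value) => new Fixed32(value << FractionBits);
    public static Fixed32 FromRaw(int raw) => new Fixed32(raw);

    // Addition and subtraction are plain integer operations on the raw values.
    public static Fixed32 operator +(Fixed32 a, Fixed32 b) => new Fixed32(a.Raw + b.Raw);
    public static Fixed32 operator -(Fixed32 a, Fixed32 b) => new Fixed32(a.Raw - b.Raw);

    // Widen to 64 bits so the intermediate product does not overflow,
    // then shift the extra 2^16 scale factor back out.
    public static Fixed32 operator *(Fixed32 a, Fixed32 b) =>
        new Fixed32((int)(((long)a.Raw * b.Raw) >> FractionBits));

    // Pre-scale the dividend so the quotient keeps its 16 fractional bits.
    public static Fixed32 operator /(Fixed32 a, Fixed32 b) =>
        new Fixed32((int)(((long)a.Raw << FractionBits) / b.Raw));

    // For display or debugging only; gameplay code should stay in fixed point.
    public double ToDouble() => Raw / 65536.0;
}
```

For example, Fixed32.FromInt(3) * (Fixed32.FromInt(1) / Fixed32.FromInt(2)) gives a value whose ToDouble() is 1.5, and every step along the way is integer arithmetic.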
The next thing to tackle was the physics engine. I couldn’t use the built-in Unity physics, since it uses floats all over the place. Plus, I wanted to be able to “roll back” the physics state, which would have been really complicated (maybe impossible?) with Unity physics. I’ll get more into rollback and what that is later. So I started looking into physics engines. I didn’t need anything too fancy. My game is all 2D and there’s nothing crazy like joints or ropes or anything. And luckily it was GitHub to the rescue again. I found this open source project called VolatilePhysics, which was already made with some networking in mind: https://github.com/ashoulson/VolatilePhysics. The engine does use floats, but a simple find-and-replace of ‘float’ to ‘FP’ actually worked! So now the base for my lockstep engine was ready for me to build on.
3.a Some internals
I’m not going to go through every struct and class of my engine, but there are a few interesting pieces worth reviewing. Like my ControllerState struct. This is the base that’s used to send frame inputs between players:
Since inputs are constantly flying back and forth, I wanted to make this struct as small as possible. And I was able to get it down to three bytes. The first byte holds the button states of the controller. Since the game uses fewer than 8 buttons, I’m able to map each of the bits in that byte to a separate button: 1 = pressed, 0 = not pressed.
Then for the stick position, I realized I don’t actually need that much accuracy. I know the X and Y values are always going to be between -1 and 1, so I can break that into 256 values per axis, which is more than enough. So I compress the X axis and the Y axis of the stick into a byte each.
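A three-byte input state along these lines could be sketched as follows. This is an illustration and not the game’s actual struct: the field names, the helper methods, and the exact mapping from a -1 to 1 axis value onto 0 to 255 are all assumptions.

```csharp
using System;

// Hypothetical 3-byte input state: 1 byte of button flags, 1 byte per stick axis.
public struct ControllerState
{
    public byte Buttons;  // bit n set = button n pressed
    public byte StickX;   // 0..255 stands for -1..1
    public byte StickY;

    public bool IsPressed(int button) => (Buttons & (1 << button)) != 0;

    public void SetPressed(int button, bool pressed)
    {
        if (pressed) Buttons |= (byte)(1 << button);
        else Buttons &= unchecked((byte)~(1 << button));
    }

    // Quantize an axis value in [-1, 1] to one of 256 steps (assumed mapping).
    public static byte EncodeAxis(float value)
    {
        float clamped = Math.Max(-1f, Math.Min(1f, value));
        return (byte)Math.Round((clamped + 1f) * 127.5f);
    }

    // Serialize to exactly three bytes at the given offset.
    public void Write(byte[] buffer, int offset)
    {
        buffer[offset] = Buttons;
        buffer[offset + 1] = StickX;
        buffer[offset + 2] = StickY;
    }

    public static ControllerState Read(byte[] buffer, int offset)
    {
        return new ControllerState
        {
            Buttons = buffer[offset],
            StickX = buffer[offset + 1],
            StickY = buffer[offset + 2]
        };
    }
}
```

The float only exists on the machine that reads the controller. Once the axis is quantized, every client (including the local one) should simulate from the same byte, so the float never enters the deterministic part of the game.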
The other advantage of keeping the input size to a minimum is that it means I can send a number of input frames at once without growing the packet size too much. This allowed me to make use of UDP instead of TCP connections. As a quick explainer, TCP and UDP are network protocols for sending messages over the internet with slightly different properties. TCP is reliable but often slower, while UDP is unreliable but usually faster. TCP guarantees that data arrives, and arrives in order, by acknowledging what it receives and retransmitting anything that goes missing, so a single lost packet holds up everything sent after it. UDP is just fire-and-forget. The disadvantage with UDP is that messages could be lost or received out of order, which isn’t an issue with TCP.
For many games, the speed of TCP is fast enough. But for fast action and fighting games, UDP helps squeeze out a little bit more responsiveness. So I knew I wanted to use UDP. To get around the unreliability, I send the frame number of the input with every input state. And I also send past frames as well. While sending out the inputs for one client, I also include the last frame it received from every other client. So each client can tailor the number of frames they send out to account for the last frame every other client received. To illustrate, a message from player 4 going out to players 1, 2 and 3 would contain the following fields:
Last frame received from player 1
Last frame received from player 2
Last frame received from player 3
My frame data (the input states for the frames the other players may not have received yet)
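The bookkeeping behind that message can be sketched roughly like this. It is a simplified illustration under my own assumptions, not the game’s code: InputSender and its methods are hypothetical names, inputs are assumed to be already packed into bytes, and there is assumed to be at least one peer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of resending unacknowledged input frames over UDP.
public sealed class InputSender
{
    // Local inputs that at least one peer may still be missing, keyed by frame.
    private readonly SortedDictionary<int, byte[]> pending = new SortedDictionary<int, byte[]>();
    private readonly Dictionary<int, int> lastAckedByPeer = new Dictionary<int, int>();

    public InputSender(IEnumerable<int> peerIds)
    {
        foreach (int id in peerIds) lastAckedByPeer[id] = -1; // nothing acked yet
    }

    // Store the local input (already packed) for a frame.
    public void AddLocalInput(int frame, byte[] packedInput) { pending[frame] = packedInput; }

    // A peer's message told us the newest frame of ours that it has received.
    public void OnAck(int peerId, int lastFrameReceived)
    {
        if (lastFrameReceived > lastAckedByPeer[peerId])
            lastAckedByPeer[peerId] = lastFrameReceived;

        // Frames that every peer has confirmed never need to be sent again.
        int oldestNeeded = OldestUnackedFrame();
        var done = new List<int>();
        foreach (int frame in pending.Keys)
            if (frame < oldestNeeded) done.Add(frame);
        foreach (int frame in done) pending.Remove(frame);
    }

    // Frames to include in the next packet, oldest first.
    public List<int> FramesToSend() => new List<int>(pending.Keys);

    // Assumes at least one peer was registered in the constructor.
    private int OldestUnackedFrame()
    {
        int min = int.MaxValue;
        foreach (int acked in lastAckedByPeer.Values) min = Math.Min(min, acked);
        return min + 1;
    }
}
```

If players 1, 2 and 3 have acknowledged frames 3, 1 and 4, the next packet carries frames 2 and up, because player 2 is the furthest behind. A lost packet needs no special handling: its frames simply ride along in the next one.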
3.b Rolling Back
One of the best ways to hide lag is with a technique called Rollback. The idea behind rollback netcode is to allow the game to progress even if a client is temporarily missing some of the necessary information from the server (or other players). It does this using a “guess” at what that information will be. Then, when the real information does come in, the game rolls back in time to a previous verified state and fast forwards back to the current time using that real info. In the case of lockstep, the information that’s required is the inputs from all of the players.
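Here is a stripped-down sketch of that roll back and fast forward cycle for a single remote player. It is only an illustration: RollbackSim is a hypothetical class, the game state is one integer, and the history is never trimmed, which a real implementation would need to do.

```csharp
using System.Collections.Generic;

// Hypothetical rollback sketch for one remote player; the state is a single long.
public sealed class RollbackSim
{
    private readonly Dictionary<int, long> snapshots = new Dictionary<int, long>();    // state at the START of each frame
    private readonly Dictionary<int, byte> localInputs = new Dictionary<int, byte>();
    private readonly Dictionary<int, byte> remoteInputs = new Dictionary<int, byte>(); // confirmed or guessed

    public int Frame { get; private set; }
    public long State { get; private set; }

    // Deterministic stand-in for the real game update.
    private static long Step(long state, byte local, byte remote) =>
        state * 31 + local * 7 + remote;

    // Advance one frame, using the guess only if the real remote input is unknown.
    public void Advance(byte localInput, byte guessedRemoteInput)
    {
        snapshots[Frame] = State;
        localInputs[Frame] = localInput;
        if (!remoteInputs.ContainsKey(Frame)) remoteInputs[Frame] = guessedRemoteInput;
        State = Step(State, localInput, remoteInputs[Frame]);
        Frame++;
    }

    // The real remote input for a frame has arrived.
    public void ConfirmRemoteInput(int frame, byte realInput)
    {
        if (frame >= Frame) { remoteInputs[frame] = realInput; return; } // arrived early, no guess made
        if (remoteInputs[frame] == realInput) return;                    // guess was right
        remoteInputs[frame] = realInput;

        // Roll back to the snapshot, then fast forward to the present.
        State = snapshots[frame];
        for (int f = frame; f < Frame; f++)
        {
            snapshots[f] = State;
            State = Step(State, localInputs[f], remoteInputs[f]);
        }
    }
}
```

After a correction, the state is identical to what it would have been if the real input had been there all along, which is what keeps the clients in sync.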
The trick with this technique is to make sure the game keeps a recent history of everything that affects gameplay so that the entire state of the game can be reverted to an earlier state. To make this game state recording easier to implement, I decided to use C# attributes and reflection.
Making a new attribute in C# is pretty straightforward. All I needed was a way to mark fields and properties, so my attribute is empty:
I used this to mark every field or property which is important to the state of the game. Here’s an example from my PlayerIdleState class:
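As a minimal sketch, and not the game’s actual code, an empty marker attribute and a class that uses it might look like this. The attribute name follows the [AddTracking] tag mentioned in this article; the members of PlayerIdleState shown here are made up for illustration.

```csharp
using System;

// Marker attribute with no members. It can be applied to fields and properties only.
[AttributeUsage(AttributeTargets.Field | AttributeTargets.Property)]
public sealed class AddTrackingAttribute : Attribute { }

// Hypothetical state class; these member names are invented for the example.
public class PlayerIdleState
{
    [AddTracking] public int framesInState;             // gameplay state, rolled back

    [AddTracking] public bool FacingRight { get; set; } // properties can be tagged too

    public string debugLabel = "idle";                  // not gameplay-relevant, not tracked
}
```

C# lets you drop the “Attribute” suffix when applying an attribute, which is why the class is declared as AddTrackingAttribute but written as [AddTracking].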
Then when the game starts, I use reflection to loop through all the elements of rollback-able classes and register the ones that have the AddTracking tag for rollback. This is a clip from my RollbackHistory class:
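The reflection pass could be sketched like this. It is an illustration under my own assumptions: TrackingScanner and SampleState are hypothetical names, and the attribute is redeclared so the snippet stands on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

[AttributeUsage(AttributeTargets.Field | AttributeTargets.Property)]
public sealed class AddTrackingAttribute : Attribute { }

// Hypothetical class with two tagged members and one untagged member.
public class SampleState
{
    [AddTracking] protected int timer;
    [AddTracking] public bool Grounded { get; set; }
    public int untracked;
}

public static class TrackingScanner
{
    // Include non-public members so protected and private state is found as well.
    private const BindingFlags Flags =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

    // Returns every field and property on the object's type that carries [AddTracking].
    public static List<MemberInfo> FindTrackedMembers(object target)
    {
        var found = new List<MemberInfo>();
        Type type = target.GetType();

        foreach (FieldInfo field in type.GetFields(Flags))
            if (field.IsDefined(typeof(AddTrackingAttribute), true))
                found.Add(field);

        foreach (PropertyInfo property in type.GetProperties(Flags))
            if (property.IsDefined(typeof(AddTrackingAttribute), true))
                found.Add(property);

        return found;
    }
}
```

One thing to watch for with this approach: GetFields on a derived type does not return private fields declared on its base classes, so a scanner like this may need to walk up the BaseType chain.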
In general, reflection is kind of slow. But since I’m just doing this once at the start of the game, speed isn’t a big concern. Also, every object in my game is created when the match starts, and after that nothing new is created or destroyed.
I gather all of these up to be stored in GameStateRecord objects. My GameStateRecords are generic and expandable. Internally, they just hold lists of every possible type to track:
And I create every GameStateRecord I’ll need at the start of the game too, since I only need a recent history of about 10 frames. I just reuse these record objects so that I don’t create garbage during gameplay.
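One way to picture the record objects and their reuse is the sketch below. It is a guess at the general shape and not the actual classes: only two value types are shown, and RecordRing with its Reserve methods is a hypothetical helper.

```csharp
using System.Collections.Generic;

// Hypothetical record of one frame: one list per tracked value type.
public sealed class GameStateRecord
{
    public readonly List<int> Ints = new List<int>();
    public readonly List<bool> Bools = new List<bool>();
}

// Fixed pool of records reused as a ring buffer, so nothing is allocated mid-match.
public sealed class RecordRing
{
    private readonly GameStateRecord[] records;

    public RecordRing(int capacity)
    {
        records = new GameStateRecord[capacity];
        for (int i = 0; i < capacity; i++) records[i] = new GameStateRecord();
    }

    // Reserve a slot for one tracked int in every record and return its index.
    public int ReserveInt()
    {
        foreach (GameStateRecord record in records) record.Ints.Add(0);
        return records[0].Ints.Count - 1;
    }

    // Same idea for bools; further types would follow the same pattern.
    public int ReserveBool()
    {
        foreach (GameStateRecord record in records) record.Bools.Add(false);
        return records[0].Bools.Count - 1;
    }

    // The record that stores the given (non-negative) frame. Frames that are
    // exactly 'capacity' apart share a record, so older history is overwritten.
    public GameStateRecord For(int frame) => records[frame % records.Length];
}
```

With a capacity of 10, frame 13 overwrites the record that held frame 3, which is fine as long as the game never needs to roll back further than the ring is long.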
Actually registering these fields and properties in GameStateRecords is done through a combination of events and lambda expressions. Back to the RollbackHistory class:
I first reserve an index for the field, or what I label the fieldId here. This is the position in the appropriate GameStateRecord list which I’m going to be storing and retrieving this field from. Then I create on the spot getters and setters for the field, which I connect to the RecordState and RollbackToState events:
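A simplified version of that wiring might look like the following. This is a sketch under assumptions, not the real RollbackHistory: it handles int fields only, and SampleBall, RegisterIntField, Record and Rollback are hypothetical names. The event names RecordState and RollbackToState come from the description above.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical one-frame record; only an int list is shown to keep this short.
public sealed class GameStateRecord
{
    public readonly List<int> Ints = new List<int>();
}

// Hypothetical tracked object used in the usage example.
public sealed class SampleBall { public int Height; }

public sealed class RollbackHistory
{
    // Raised to save the current state into a record, and to restore from one.
    public event Action<GameStateRecord> RecordState;
    public event Action<GameStateRecord> RollbackToState;

    private int nextIntId;

    // Wire one int field of one object into both events.
    public void RegisterIntField(object owner, FieldInfo field)
    {
        int fieldId = nextIntId++;   // slot reserved for this field

        // "Getter": copy the field's current value into its slot.
        RecordState += record =>
        {
            while (record.Ints.Count <= fieldId) record.Ints.Add(0);
            record.Ints[fieldId] = (int)field.GetValue(owner);
        };

        // "Setter": copy the saved value back onto the object.
        RollbackToState += record => field.SetValue(owner, record.Ints[fieldId]);
    }

    public void Record(GameStateRecord record) => RecordState?.Invoke(record);
    public void Rollback(GameStateRecord record) => RollbackToState?.Invoke(record);
}
```

Usage would be along the lines of registering typeof(SampleBall).GetField("Height") for a ball instance, calling Record each frame, and calling Rollback with an older record when a misprediction is detected. Note that FieldInfo.GetValue and SetValue box the value on every call, so a version meant to run every frame would probably want to build typed delegates once at startup instead.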
This made it easy to add and remove elements that needed to be able to roll back without having to modify my GameStateRecord each time. Once I had lists for every type I’d want to keep track of, it was a simple matter of adding a new [AddTracking] tag.
Even with this easy way to add tracking to elements, I’d say one of the most difficult parts of the process was tracking down everything I needed to tag. There was a lot of iteration time of playing an online match against myself until I triggered a desync, and then scouring through my code to see what field I forgot to track that was triggering it.
The final thing I’ll mention about my rollback code is the guess-ahead strategy. Obviously if you are advancing frames of a game before having the real input from other players, you’re going to want to try to guess what their inputs will be. If you guess correctly, the rollback will be completely seamless. I figured there must be some sort of fancy learning algorithm that predicts a player’s input from how they’ve played previously. And I knew GGPO was one of the best rollback solutions out there, but I couldn’t find anything about what it was doing for input prediction. That was until I found an old forum post which mentioned an interview with the creator of GGPO in an issue of Gamasutra from back when Gamasutra was an actual physically printed magazine.
After some more scouring of the internet, I found a scanned image of the article I was looking for, and in it, the creator revealed the secret strategy GGPO uses for input prediction which was… sticky inputs. Basically it just uses the input from the previous frame (for all non local players). Which in retrospect, makes sense. Frames go by pretty quickly. And the odds that a player’s inputs will change between any two consecutive frames is actually pretty low. Sometimes simpler is better.
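Sticky inputs are about as simple as prediction gets. The sketch below is my own minimal illustration of the idea, not GGPO’s code: StickyInputPredictor is a hypothetical name, and a single byte stands in for a full input state.

```csharp
using System.Collections.Generic;

// Hypothetical predictor: when a remote player's input for a frame is missing,
// reuse the most recent earlier input that player actually sent.
public sealed class StickyInputPredictor
{
    private readonly Dictionary<int, byte> confirmed = new Dictionary<int, byte>();

    // Record a real input received from the remote player.
    public void Confirm(int frame, byte input) { confirmed[frame] = input; }

    // Returns the real input if we have it, otherwise the closest earlier one.
    public byte InputFor(int frame, out bool isPrediction)
    {
        if (confirmed.TryGetValue(frame, out byte real))
        {
            isPrediction = false;
            return real;
        }

        isPrediction = true;
        for (int f = frame - 1; f >= 0; f--)
            if (confirmed.TryGetValue(f, out byte previous))
                return previous;

        return 0;   // nothing received yet: assume no buttons held
    }
}
```

If the last confirmed input was on frame 1, then asking for frame 3 returns that same input flagged as a prediction, and the rollback code only has work to do if the real frame 3 input turns out to be different.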
Around the time I was wrapping up my netcode, I was able to land a publishing deal. This opened up some new possibilities for me. My publisher (Blowfish Studios) was able to handle server management, which allowed me to create something better than just the simple relay server logic I was using.
One of the big issues that can happen with lockstep is slowdown. Sometimes even rollback isn’t enough to hide the latency between players. And when the game runs out of rollback frames, it needs to stop and wait for the inputs to arrive. This issue is exacerbated when you have more than two players. If one player has a poor connection, that can slow down the game for everyone else. So I wanted to see if I could do something to get around this issue. Which is how I came up with the idea of server authoritative lockstep.
I’m sure I’m not the first one to come up with this type of architecture, but I didn’t find anything online about what the name might be. So I’m labeling it Server Authoritative Lockstep. The basic idea is that the server acts as a steady clock and the authority for a frame’s inputs, but otherwise it works pretty similar to a standard lockstep setup. The main benefit is that all of a sudden you have an impartial judge that doesn’t need to wait for every player’s input in order to advance the frame. The server just broadcasts all the input it received over the last frame. And if a player’s input didn’t arrive in time, the server just uses their input from the last frame. Also because the server program stays simple and still doesn’t simulate the whole game, it’s pretty cheap to run.
This had the added benefit of simplifying my architecture a bit. The clients now just send a steady stream of inputs to the server and don’t worry about specifying a frame. The server places the inputs in the upcoming frame and sends the whole frame out when it’s time. Clients still have to tell the server what the last frame they received is, so the server can send multiple frames if it needs to. But it’s a lot cleaner than the four way juggling act I was doing before.
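The server side of this can be surprisingly small. The following is a sketch of the idea under my own assumptions, not the shipped server: LockstepServer and its methods are hypothetical names, one byte stands in for a full input state, and the frame history is kept forever where a real server would trim it.

```csharp
using System.Collections.Generic;

// Hypothetical server: it never simulates the game, it only decides which
// input every player gets for each frame.
public sealed class LockstepServer
{
    private readonly byte[] latestInput;                            // newest input per player
    private readonly List<byte[]> history = new List<byte[]>();     // history[n] = inputs of frame n

    public LockstepServer(int playerCount) { latestInput = new byte[playerCount]; }

    public int FramesSent => history.Count;

    // Called whenever an input arrives; clients do not specify a frame.
    public void OnInputReceived(int player, byte input) { latestInput[player] = input; }

    // Called by a fixed-rate timer. Returns the authoritative inputs for the new frame.
    public byte[] Tick()
    {
        // A player whose packet did not arrive in time keeps last frame's input.
        var frameInputs = (byte[])latestInput.Clone();
        history.Add(frameInputs);
        return frameInputs;
    }

    // Everything a client still needs, given the last frame it reported receiving
    // (-1 if it has received nothing yet).
    public List<byte[]> FramesAfter(int lastFrameReceived)
    {
        int start = lastFrameReceived + 1;
        return history.GetRange(start, history.Count - start);
    }
}
```

Because Tick never waits on anyone, one slow connection can no longer stall the match for the other players. The slow player just sees more of their own inputs replaced by sticky ones.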
On top of that, I’m able to hide slightly more lag by pushing the clients a few frames ahead of the last server frame they received (dynamically based on their ping). So the code is actually rolling back and fast forwarding every frame. Luckily there aren’t too many moving pieces in my game, so I’m able to do that all pretty quickly.
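How far ahead to run is a tuning decision. As one possible rule, and purely an assumption on my part rather than the formula the game uses, a client could lead by enough frames to cover the one-way trip to the server, capped by the amount of rollback history it keeps. The names LeadFrames and FromPing are hypothetical.

```csharp
using System;

public static class LeadFrames
{
    // Hypothetical rule: frames of lead = ceil(oneWayMs / frameMs), clamped to [0, maxLead].
    // oneWayMs = roundTripMs / 2 and frameMs = 1000 / framesPerSecond, so the ratio
    // simplifies to roundTripMs * framesPerSecond / 2000, computed here with integers only.
    public static int FromPing(int roundTripMs, int framesPerSecond, int maxLead)
    {
        int numerator = roundTripMs * framesPerSecond;
        int lead = (numerator + 1999) / 2000;   // integer ceiling division
        return Math.Max(0, Math.Min(maxLead, lead));
    }
}
```

At 60 frames per second, a 100 ms ping works out to 3 frames of lead under this rule, and a 50 ms ping to 2.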
The only downside to this setup is that introducing a server into the mix can actually make lag worse. For example, if the server is in California and two players in Italy are trying to play a match against each other, the response time is going to be much worse than if they were just playing peer-to-peer. But having multiple servers can split up the player base. So there was one final issue to solve.
5. Over The Edge
The final piece to my netcode puzzle was finding out about a service called Edgegap. I don’t mean for this article to sound like an advertisement for them or anything, but their service ended up being exactly what I needed for my situation. They basically provide the infrastructure to give you servers on demand with super-fast spin up times.
The idea is that, using Docker, they’re able to provide a server on a per-match basis. When an online match is started, I make a REST API call to their service with the IP addresses of the players and they load the Docker image of my server application onto a server they have that’s in the best location for those players. And this all happens within 8 seconds. So those two Italian players would most likely be assigned a server in Italy (or at least somewhere close by in Europe).
This allows me to keep the number of servers I’m actively paying for pretty low, while at the same time having a huge network to draw from. And best of all, the integration was pretty straightforward. I just needed to turn my server application into a Docker image and upload it to the Edgegap platform. This also allowed me to continue using a single matchmaking server which my publisher was managing, so it didn’t split up my player base. But now this matchmaker just handles the API call to Edgegap to get a new server and then sends players over when the match is ready to start.
6. If I Could Turn Back Time
There are a few things I might have done differently if I had to do it again. Obviously now that GGPO is open source, I would have started with that. But I’m not even sure how much time that would have saved. GGPO is mainly a networking layer, and a lot of the difficulty was just making sure the game can be rolled back perfectly without desyncing. Which is all specific to my game, and there’s really no shortcut for that. Maybe it would have been easier to just go with standard authoritative game servers. But then I would have had to deal with game state serialization and syncing, which would have been almost as much of a pain as making it rollback enabled. So in conclusion, there are no easy solutions if you want good netcode for your game.
In retrospect, if I had known how much effort this was going to take, I probably wouldn’t have tried to make an online multiplayer game as a one-person game development studio. So I’m kind of glad I didn’t really know what I was getting into, because I am pretty happy I was able to complete KungFu Kickball. The game is finally out now on every major platform with cross-play. I’m writing this article before the release though, so hopefully no major issues have popped up that might invalidate this whole thing. But all in all, I learned a ton and I’m pretty proud I got it all working, even if it did take a while. However, I will say that the next game I make definitely will NOT have online multiplayer.