Creating a massively multiplayer online game is game development’s equivalent of a moon shot. It’s expensive, technically difficult, and can take many years to complete. The rewards for success are so attractive, however, that it seems everyone is willing to give it a try. And yet, the technology required is not easy for newcomers to develop. Billing, patching, support, administration, client messaging, stable servers, data persistence, and server clusters all present potential technical stumbling blocks. Sensing this need, several companies are currently developing software libraries and products explicitly created to ease these hurdles. Though this type of middleware is often expensive and complex in its own right, the time savings and risk reduction it promises could be well worth it.
Two products that provide nearly complete MMOG solutions are Butterfly.net and Zona’s Terazona. Both of these products supply the skeleton of an MMOG, leaving only the skin of a graphics engine, a billing system and the guts of the actual game mechanics to be defined. Even the APIs where the missing bits connect are made clear.
Taken at the highest level, Butterfly.net and Terazona use the same solution strategy. The details are different enough that the environment and personality of the final developer will clearly suggest one over the other. Their highest-level strategy is not appropriate for all future MMOGs, however, and whether you would use Butterfly.net or Terazona depends on whether your game fits into their mold. These products are best suited for a game with a fixed and well-defined landscape, a reasonable number of NPCs or monsters, and a clean and simple persistence model. Either would have been perfect for Mythic’s Dark Age of Camelot or Jaleco’s Fighter Ace. A brief explanation of how they work will make this clearer.
Common parts of the high level design
Messaging to Dispatcher or Gateway:
Both systems have a simple API to connect, validate an account, get the available game characters, and connect with a particular character. This uses a client/server messaging system, which is easily extensible, reliable, and simple to integrate into a game client. The client messages (e.g. a direction or velocity change) are sent to a consistent point of contact on the server side. This Dispatcher or Gateway forwards these messages deeper into the server cluster. You can have multiple gateways for bandwidth management and reliability. These dispatchers know which server in the cluster is currently simulating your character and forward the messages to it. This hides the division of the world from the game client.
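To make the forwarding idea concrete, here is a minimal sketch in Python. It is not the API of either product; the names `Gateway`, `SectionServer`, `assign` and `forward` are invented for illustration, and a real gateway would sit on top of network sockets rather than in-memory lists.

```python
from dataclasses import dataclass, field


@dataclass
class SectionServer:
    """Hypothetical stand-in for one server in the cluster; it only records messages."""
    name: str
    inbox: list = field(default_factory=list)

    def deliver(self, character_id, message):
        self.inbox.append((character_id, message))


class Gateway:
    """Single point of contact for clients; hides which server owns a character."""

    def __init__(self):
        self._owner = {}  # character_id -> SectionServer currently simulating it

    def assign(self, character_id, server):
        # Called when a character logs in or crosses into another section.
        self._owner[character_id] = server

    def forward(self, character_id, message):
        server = self._owner.get(character_id)
        if server is None:
            raise KeyError(f"no server is simulating {character_id!r}")
        server.deliver(character_id, message)
        return server.name


village = SectionServer("village")
valley = SectionServer("valley")
gateway = Gateway()

gateway.assign("hero", village)
gateway.forward("hero", {"type": "velocity", "dx": 1.0, "dy": 0.0})

gateway.assign("hero", valley)  # the hero walks into the valley
gateway.forward("hero", {"type": "velocity", "dx": 0.0, "dy": 1.0})
```

The client code never changes when the character moves between servers; only the gateway's routing table does.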
Game Servers and their Landscape:
The developer divides up their world into fixed sections. The size of each section depends on how much action is expected to occur there. A good example would be a small village or a large valley. The actor/object positioning and landscape collision for each section is simulated on a single server. There can be multiple sections on a single server, but there cannot be multiple servers for a single section. These section servers are the final destination of the game client messages. If any NPCs or other players come into view or if a game specific message is generated, a message is sent back to the game client. The boundaries between these world locations are implemented in different ways for Butterfly.net and Terazona, but both make information about things on the other sides of these boundaries available to the game client. As far as the game client is concerned there are no boundaries.
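The section-to-server rule can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `WorldMap` class, the rectangular bounds, and the server names are ours, and neither product necessarily divides its world into rectangles.

```python
class WorldMap:
    """Fixed world layout: each section is hosted by exactly one server."""

    def __init__(self):
        self._bounds = {}  # section name -> (x_min, y_min, x_max, y_max)
        self._host = {}    # section name -> server name

    def add_section(self, section, bounds, server):
        # A server may host many sections, but a section has only one server.
        if section in self._host:
            raise ValueError(f"{section!r} already runs on {self._host[section]!r}")
        self._bounds[section] = bounds
        self._host[section] = server

    def section_at(self, x, y):
        for section, (x_min, y_min, x_max, y_max) in self._bounds.items():
            if x_min <= x < x_max and y_min <= y < y_max:
                return section
        return None

    def server_at(self, x, y):
        # Returns None when the position lies outside every section.
        return self._host.get(self.section_at(x, y))


world = WorldMap()
world.add_section("village", (0, 0, 100, 100), "server-a")     # small but busy
world.add_section("valley", (100, 0, 1000, 1000), "server-a")  # large but quiet
world.add_section("castle", (0, 100, 100, 200), "server-b")
```

Here the busy village and the quiet valley share one server, while the castle gets a machine of its own.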
These servers are divided into two separate halves: the generic landscape functionality, and the game specific mechanics. The developer writes the game specific mechanics to take care of the relatively simple rules like trading, grouping, targeting, etc. In addition, the developer must write a separate process that simulates monsters.
Game Specific Mechanics:
Messages from the client are forwarded to the game specific functionality in case special physics or movement rules are in play. Butterfly.net has you write this code in Python, which is run by an interpreter built into the game server. Terazona asks that you link in a shared library to receive these callbacks. Either way, this rule checking code is mostly used to validate game state and client messages. But it isn’t always clear whether a given piece of game mechanics belongs in this library or in the NPC servers.
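As an illustration of the kind of rule checking involved, here is a sketch of a movement validator in Python. The function name, its signature, and the speed limit are all hypothetical; they do not come from either product's callback API.

```python
import math

MAX_SPEED = 6.0  # world units per second; an assumed game rule


def validate_move(old_pos, new_pos, elapsed):
    """Reject a client position update that implies an impossible speed.

    old_pos and new_pos are (x, y) tuples; elapsed is in seconds.
    """
    if elapsed <= 0:
        return False
    distance = math.hypot(new_pos[0] - old_pos[0], new_pos[1] - old_pos[1])
    return distance <= MAX_SPEED * elapsed
```

A server would call something like this on every movement message and discard, or correct, any update that fails the check.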
NPCs and Monsters:
Both systems pass the buck when it comes to monster AI. Terazona has NPC servers connect as a privileged game client. Butterfly.net has a separate NPC server for every section. In both cases how these separate servers are designed and coded is completely up to the developer. Of course, there is an API for receiving callbacks and sending messages to the landscape and the game clients. Interestingly, it is suggested that you should implement anything complex as an NPC. For instance, a temporary and invisible NPC might be used to calculate battle results or monster spawners.
The player data structure (or parts of it) is automatically persisted by both systems. They each suggest a database, but claim to be compliant with many different brands. Butterfly.net requires that you define all the state variables that are needed (also in the database), and it persists the player’s data in the clear when it can. Terazona saves out the player data as a block into the database. This saves you from having to define a table, but it also prevents you from performing any simple queries on the player data.
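The trade-off between the two persistence styles is easy to demonstrate. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in database; the table names, columns, and JSON encoding are our assumptions and not the schema of either product.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
# Style 1: every state variable is a defined column.
db.execute("CREATE TABLE player_columns (name TEXT PRIMARY KEY, level INTEGER, gold INTEGER)")
# Style 2: the whole player record is stored as one opaque block.
db.execute("CREATE TABLE player_blob (name TEXT PRIMARY KEY, data BLOB)")

players = [
    {"name": "ayla", "level": 12, "gold": 300},
    {"name": "borin", "level": 4, "gold": 950},
]
for p in players:
    db.execute("INSERT INTO player_columns VALUES (?, ?, ?)",
               (p["name"], p["level"], p["gold"]))
    db.execute("INSERT INTO player_blob VALUES (?, ?)",
               (p["name"], json.dumps(p).encode("utf-8")))

# With columns, the database can answer the question by itself.
high_level = [row[0] for row in
              db.execute("SELECT name FROM player_columns WHERE level >= 10")]

# With a block, every row must be fetched and decoded by application code.
high_level_from_blob = [
    name for name, data in db.execute("SELECT name, data FROM player_blob")
    if json.loads(data)["level"] >= 10
]
```

Both lists contain the same player, but only the first query can be run by a support person with nothing more than a database console.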
What Type of Games Work?
Despite some of the details of these high-level design assumptions, both systems claim to be very extensible. In fact, both are open to deals that include the source. So, it should be possible to shoehorn something unique into either of these frameworks. This increases risk and time of development, however, which could offset the original reason for using one of these middleware solutions in the first place.
Some games are a perfect fit for these types of systems. Most of the last generation of massively multiplayer RPGs would have been able to use either of these. Their landscapes are stable, the population density of various areas is easily predicted, and the NPC load isn’t outrageous. A large arena based game like Jaleco’s Fighter Ace could have used a system like this with no problem. The landscape is simple, and the complex physics of flight would have been placed in the NPC server. Both systems allow the performance characteristics to be tuned for a faster, twitch style of game.
However, many of the coming generation of MMOGs are starting to break the old mold. The Sims Online and Tabula Rasa will both have a very dynamic system of landscapes, with server processes coming in and out of existence. Also, anyone who tries a game with a random landscape system, or with a landscape that is often modified and persisted, might have difficulty.
Terazona server components are implemented almost entirely in Java. Many components make up the server suite; these include the Authentication server, the Administration Component, the NPC Servers, the GSS or Game State Servers, and others. Inter-component communication is realized by an implementation of the Java Message-oriented middleware standard provided by ICE Technology. The benefit of Java appears to be its portability. While we only evaluated the Windows version, they assure us the components run fine on several other platforms. Hopefully the downside isn’t performance.
The Game State Servers require the customer to implement a set of functions (their GSAPI, or Game State API) in C/C++ and compile them into a DLL or shared object that the server will load and use for game state validation. I found the interface to the GSAPI clumsy; they give you a header file full of C function declarations and tell you to implement them. This is okay, but I would have preferred exporting a structure of function pointers, or implementing a class with pure virtual members and exporting a factory.
In Terazona, sections are called regions, but they operate much like the above description. Probably the most interesting thing to note is the fail-over support inherent to their design. Since entities (players or NPCs generally) are ghosted to the necessary neighbor regions, when a region server fails, another server will re-instantiate the region, and attempt to get entity state from servers that had it ghosted. If that fails, it will go back to the database. GSS servers can be brought up and down while the game is running, without restarting other components. The fail-over support for the other server components is not evident in our evaluation kit, but they promise something exists for the other components as well. Terazona does not have its own implementation of a landscape, and does not do landscape validation for you. This is your responsibility to implement in the GSAPI.
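The recovery order described above (ghosted copies first, then the database) can be sketched as follows. This is only our reading of the behavior, written in Python with plain dictionaries standing in for neighbor servers and the database; none of the names come from Terazona itself.

```python
def recover_entity(entity_id, neighbor_ghosts, database):
    """Return (state, source) for one entity of a failed region.

    neighbor_ghosts is a list of dicts, one per neighboring region server,
    mapping entity ids to the ghosted state that server holds.
    """
    for ghosts in neighbor_ghosts:
        if entity_id in ghosts:
            return ghosts[entity_id], "ghost"
    if entity_id in database:
        return database[entity_id], "database"  # may be older than a ghost copy
    return None, "lost"


def reinstantiate_region(entity_ids, neighbor_ghosts, database):
    """Rebuild the entity table of a failed region on a replacement server."""
    return {eid: recover_entity(eid, neighbor_ghosts, database)
            for eid in entity_ids}


north = {"hero": {"hp": 40}}             # ghost copy held by a neighbor
south = {}                               # this neighbor never saw the entities
saved = {"hero": {"hp": 90}, "merchant": {"hp": 10}}

restored = reinstantiate_region(["hero", "merchant"], [north, south], saved)
```

In this example the hero comes back with the fresher ghosted state, while the merchant, who was not near a boundary, falls back to the last saved copy.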
The ‘NPC server as a trusted client’ model is used in Terazona. The downside of this model is that the most complicated logic sometimes ends up in the NPC servers, and their scalability is left up to the customer. The client-server protocol provided is based on TCP. This may be unacceptable for many applications, though they promise a reliable UDP implementation will be available. Interestingly, Zona’s business model is attractive to new MMOG developers. They only ask a small upfront fee in exchange for part of the subscription revenue. This has the extra benefit of increasing the likely support and library improvements you will receive.
Butterfly.net is more of a total solution than Terazona, which isn’t necessarily a good thing. In fact they are proponents of a utility model, which allows you to use their server clusters. As traffic goes up and down those servers are shared with other products that are also using the utility model. Although very cost efficient, this could be frightening to even the most experienced MMOG live-teams. Of course, they don’t prevent you from putting the parts together yourself on your own systems.
The rules management system that the server uses to validate game state uses functions written in Python. This could greatly simplify game authoring and help with thread safety. It performs the same role as the shared library which connects to Terazona’s GSAPI. The Python interpreter is built into the game server, so performance shouldn’t be a problem. However, extremely complex systems should probably be written in the NPC servers.
The Butterfly.net system can have a separate NPC server for every landscape section (which they call locales). This should help with the distribution of CPU load if you have an NPC-heavy game. The NPC server is left entirely to you; it is not written in Python. Human intervention is required, however, should these NPC servers or Locale servers crash. Butterfly.net has no automated server management that can restart processes or bring up another from a pool of servers. Human intervention is unfortunate, but if you are required to reboot a machine, the entire system will survive and rediscover the existence of the missing locale.
The suggested operating system is Linux with an Oracle database. This would be the platform of choice for many developers, so it shouldn’t be a problem. But if you were required to use a different operating system or database, they claim that much of the system can be ported. Also, they have a nice example of how to connect to NDL’s NetImmerse graphics engine, should you wish to use a third party solution for the client as well. Tools for porting terrain models from Maya or 3D Max into Butterfly.net’s Landscape quad-tree are also provided.
What would we prefer?
Although total solutions have their place, MMOGs might be too young a genre to support this strategy. There is definitely a use for middleware, however. A more open ended toolbox approach is proving useful for graphics engines such as Criterion’s RenderWare, and I believe that would be a useful model to emulate on the server-side as well. This would mean a large collection of smaller functional libraries, separated by a well-documented API. Libraries of functions for messaging, persistence, cluster-control, and terrain quad-trees could be combined into whatever system a developer might need. The development would require more work, but unique server-side solutions could be created and customers would feel more comfortable with the end-result.
Also, the separation between rules systems (GSAPI) and NPCs is troubling and confusing. Although the placement of functionality into one over the other might not be arbitrary, it might become hard to support and update as the application grows in complexity. A combination of these two systems into a single developer maintained process would be much cleaner.
Lastly, neither system supports the concept of an automated generic pool of servers. This would entail every server being able to perform any function. Should an existing server fail or the load become too high, a cluster manager would automatically bring one of these generic servers into play. This would create a high degree of reliability, and would require very little human intervention.
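A rough sketch of what we have in mind follows, in Python. It describes neither product; the `ClusterManager` class, its method names, and the 0.8 load threshold are all invented for the example, and a real manager would also need health checks and a way to return servers to the pool.

```python
class ClusterManager:
    """Hands out generic spare servers on failure or overload."""

    def __init__(self, spare_servers, max_load=0.8):
        self.spares = list(spare_servers)  # servers able to take on any role
        self.roles = {}                    # active server -> role it performs
        self.max_load = max_load           # assumed threshold, fraction of capacity

    def start(self, role):
        if not self.spares:
            raise RuntimeError("server pool is exhausted")
        server = self.spares.pop(0)
        self.roles[server] = role
        return server

    def report_failure(self, server):
        # Bring a spare into play with the role of the failed server.
        role = self.roles.pop(server)
        return self.start(role)

    def report_load(self, server, load):
        # Add a second server for the same role when load is too high.
        if load > self.max_load:
            return self.start(self.roles[server])
        return None


manager = ClusterManager(["spare-1", "spare-2", "spare-3"])
first = manager.start("gateway")
replacement = manager.report_failure(first)
helper = manager.report_load(replacement, 0.95)
```

No operator is involved at any step; the only human task left is keeping the pool of spares stocked.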
It’s very exciting that the Massively Multiplayer market is growing enough to support the creation of these types of solutions. MMOGs are becoming too complex and risky to develop for very much innovation to occur, and we are all happier if systems like Terazona and Butterfly.net help alleviate that. They face a challenge, however, in that the game industry treats third party products and code reuse pretty poorly. The real test is whether these products are useful enough for people to overcome their superstitions. Neither has a well-known MMOG that has used its technology yet. Even so, both are strong contenders.