Porting Games to Mac OS X

With the release of Apple's new Mach-based operating system, OS X, the Macintosh gains a new path to high-performance gaming. Tim Wood from the Omni Group, veteran of many OS X game ports, shows how to design your new title to ensure compatibility with this operating system or port your existing game over to take advantage of the Macintosh's untapped market.

With the release of Mac OS X, the Macintosh platform gains a new path to easy-to-use and high-performance gaming. This article will address how you can easily port your current game over to the Mac and the APIs in Mac OS X that you can use to do so. Many of the issues involved with porting to a new operating system are common to porting to any new OS, not just the Mac.

I'll primarily be addressing this subject from my experiences in porting games such as Quake 3: Arena, Star Trek: Voyager — Elite Force, Oni, and American McGee's Alice. All of these titles obviously belong to a similar style, but the techniques I'll describe in this article are applicable to any game. We at Omni have been able to port well-structured games successfully in a matter of days, and sometimes hours, due to the productivity and ease-of-use advancements in OS X.

Why Port to the Mac?

The first reason is obviously economics. Sure, writing games is one of the most exciting and most challenging jobs around, but if you aren't generating income, you are either independently wealthy, or you won't be doing it for very long. Every developer should strive to write portable and modular code as a matter of course. The benefits of doing this are many and diverse. One of the benefits is being able to move your code easily to a new platform and attract an audience that you wouldn't have attracted otherwise. If your game is written correctly from the beginning, a port to the Mac will generate much more money than the cost of porting it. If you don't do the port, you might as well toss money in the trash.

Screenshot from Star Trek: Voyager — Elite Force.

Revenue in the Mac market will certainly not be as high as in the PC or console markets, but neither are costs. Advertising does not cost millions of dollars in the Mac market. The community is tight-knit, so word-of-mouth advertising can yield very good market penetration, and if you have a simultaneous release date you may also be able to piggyback on your PC marketing. The Mac market also doesn't demand that you produce a game with a $3 million budget. If you are looking at original development on the Mac, you can build games very cheaply that will be well received (and perhaps focus more on gameplay rather than having to spend time on all the latest graphics effects just to get on the shelf). Simple and well-designed titles are possible on the Mac.

The Mac market has a longer shelf life for titles than the PC market, and there is less overall competition. Thus, while it is very easy to lose money on a PC or console title, it takes much more effort to do so on a Mac title.

Also, if you license your PC publication rights, you can often hold back the Mac rights, since many PC publishers will not see the economy of scale on the Mac that they need in order to turn a profit. You can then either publish on the Mac yourself or find a publisher that specializes in the Mac market and knows how to make money there. This will give your development house additional income beyond the advances and royalties you get from your PC publisher.

There are other, less obviously money-grubbing reasons to port to the Mac. If you plan on licensing or reusing the engine that you are using for your current title, the work in making your engine run on the Mac can be amortized over multiple titles, and it can generate more licensing interest.

Finally, moving your code to another platform can help uncover many latent bugs in your code. In this case, the extra effort involved in supporting multiple platforms can actually reduce the amount of work at the tail end of a project by ensuring that the base you are building your game on is as stable as possible.

Planning Your Game

As with any complicated task, large gains in productivity can be had if you plan your effort before embarking on it. The first step in planning a project is deciding where you want to end up after all your effort has been expended. This applies to both the features you want in your game and the platforms on which it will run. The earlier you decide on your supported platforms, the easier it will be to achieve your goals.

The decision of which features your game will include is related to the platforms you will support. For example, if you are planning on supporting wireless gaming, you probably won't be using OpenGL for at least a couple of years (beyond that, who knows?). Likewise, if you are going to write games that are going to run on the Mac, you need to pick foundation technologies that are available there. This excludes proprietary technologies like DirectThis and DirectThat (and Mac proprietary technologies such as Core Graphics, Cocoa, Carbon, and so on if you are going to run on Windows).

There are many well-known software techniques for multi-platform development. I'll assume you are familiar with these for the purpose of this article, and I'll focus exclusively on Mac OS X-specific techniques. Some of the arrows in your quiver for multi-platform development should include separating code using ifdefs based on the platform, using custom data types to steer clear of standard type system dependencies, avoiding depending on bitfield order, steering clear of compiler- and linker-specific behaviors, and of course using a good source code control system and open APIs.
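As a rough illustration of three of these techniques (platform ifdefs, engine-owned data types, and masks in place of bitfields), here is a minimal sketch in C. Every `GAME_*` and `game_*` name is made up for this example and is not taken from any of the games mentioned; the 5-6-5 color layout is likewise only an assumed example format.

```c
#include <assert.h>
#include <stdint.h>

/* Pick a platform tag at compile time. GAME_PLATFORM_NAME is a
   hypothetical name; the macros tested are the usual predefined ones. */
#if defined(_WIN32)
#  define GAME_PLATFORM_NAME "windows"
#elif defined(__APPLE__) && defined(__MACH__)
#  define GAME_PLATFORM_NAME "macosx"
#else
#  define GAME_PLATFORM_NAME "generic-unix"
#endif

/* Engine-owned fixed-size types, so file formats and network packets
   do not depend on what 'int' or 'long' mean for a given compiler. */
typedef uint8_t  game_u8;
typedef uint16_t game_u16;
typedef uint32_t game_u32;

/* Explicit shifts and masks instead of C bitfields, whose layout is
   left to the compiler. Packs a 5-6-5 color into 16 bits. */
static game_u16 game_pack_rgb565(game_u8 r, game_u8 g, game_u8 b)
{
    assert(r < 32 && g < 64 && b < 32);
    return (game_u16)((r << 11) | (g << 5) | b);
}

/* Extracts the 6-bit green channel from a packed 5-6-5 color. */
static game_u8 game_rgb565_green(game_u16 packed)
{
    return (game_u8)((packed >> 5) & 0x3F);
}
```

Because the packing is spelled out with shifts, the same bytes come out on every compiler, which is not something a bitfield declaration guarantees.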


Mac OS X Technology

Mac OS X has two primary high-level toolboxes, Cocoa and Carbon. Mac OS X also has two object file formats, Mach-O and CFM. There are some choices to be considered when choosing between these. In this article, I'll be talking about the Cocoa/Mach-O approach. This choice is primarily due to the fact that fewer lines of code are necessary to accomplish similar functionality in Cocoa, and Cocoa requires Mach-O.

Cocoa is Apple's advanced object-oriented application toolkit, which is based on the technology it acquired from NeXT. Carbon is a distillation of the classic Mac OS toolbox APIs that removes a bunch of the less commonly used functions which were not easily implemented in the new world. This includes removing things such as direct access to the hardware, completely obsolete APIs, and so on. The remaining APIs have been modified to work in terms of the new underlying OS. So, a Carbon application can run on both OS 9 and OS X (as long as it doesn't make use of any new OS X services).

The foundation of Mac OS X is the Mach kernel. Mach provides the hardware abstraction and lowest-level OS services. This includes interprocess communication, protected virtual memory, threads, symmetric multi-processing, and driver services. The BSD/POSIX layer sits on top of Mach (so a BSD process is really a Mach task with a little extra goo, and a POSIX thread is really a Mach thread with some of its own goo). All of the API sets are accessible from a user program written using Carbon or Cocoa, but the vast majority of users will be able simply to use the high-level APIs or maybe occasionally use the intermediate BSD APIs. Only in special cases is it necessary to access the Mach API directly, so most of the time Mach sits in the background, providing a rock-solid OS infrastructure. As an example, Mach provides an API for pausing a thread, getting or setting its registers, and then allowing it to continue. Programs don't typically need to do this, but if you were writing a debugger, this would be very important.

Screenshot from Quake 3: Arena.

Platform-Specific APIs

There are a few platform-specific APIs that most games depend on. These are APIs for performing the following tasks:

  • System functions (file and network access, memory management, threading, code loading and unloading)
  • Display management
  • 3D graphics rendering
  • Music and sound effect playback
  • Reading input devices.

I'll address each of these functional groups in turn.

System Functions
Mac OS X has a BSD 4.4 API layer as part of its Mach-based kernel. Apple has stated that their goal is to be POSIX-compliant. This means that a large portion of the platform-specific APIs can be addressed via both BSD and POSIX APIs.

The Windows stdio interface is very similar to the BSD functionality from which it was copied (except for Windows' strange notion of binary versus text files). The stdio API can be used for all file I/O, with the option of accessing the Mach API for memory-mapping files.

Likewise, the BSD sockets API served as the template for the Windows version. There are some minor differences here (select() versus WaitForSingleObject() and the list of supported socket options), but nothing terribly surprising.
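To make the select() side of that comparison concrete, here is a small sketch of a "wait until this socket is readable" helper using only standard BSD/POSIX calls. The name `net_wait_readable` and the millisecond timeout convention are assumptions made for this example, not something from the ported games.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if 'fd' has data to read within 'timeout_ms' milliseconds,
   0 on timeout, and -1 on error. Hypothetical helper for illustration. */
static int net_wait_readable(int fd, int timeout_ms)
{
    fd_set read_set;
    struct timeval tv;
    int ready;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&read_set);
    FD_SET(fd, &read_set);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* The first argument is the highest descriptor number plus one. */
    ready = select(fd + 1, &read_set, NULL, NULL, &tv);
    if (ready < 0)
        return -1;
    return (ready > 0 && FD_ISSET(fd, &read_set)) ? 1 : 0;
}
```

A game loop would typically call this with a timeout of zero once per frame, so that the network check never stalls rendering.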

Unlike on many systems, the standard memory allocation package on Mac OS X performs very well. Still, for portability with other platforms you may choose simply to use malloc to allocate large chunks of memory for your internal memory allocator.
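One common shape for such an internal allocator is a bump ("arena") allocator that takes a single large block from malloc and hands out slices of it. The sketch below is only one possible design; the `GameArena` type, its functions, and the 16-byte alignment are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    unsigned char *base;  /* block obtained from malloc          */
    size_t         size;  /* total bytes in the block            */
    size_t         used;  /* bytes handed out so far             */
} GameArena;

/* Grabs one large chunk from the system allocator. Returns 0 on failure. */
static int arena_init(GameArena *a, size_t size)
{
    assert(a != NULL && size > 0);
    a->base = (unsigned char *)malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

/* Hands out 'bytes' rounded up to a multiple of 16, or NULL if full. */
static void *arena_alloc(GameArena *a, size_t bytes)
{
    size_t aligned = (bytes + 15u) & ~(size_t)15u;
    void *p;

    if (aligned < bytes || aligned > a->size - a->used)
        return NULL;
    p = a->base + a->used;
    a->used += aligned;
    return p;
}

/* Frees everything at once, for example at the end of a level. */
static void arena_reset(GameArena *a)
{
    a->used = 0;
}

static void arena_destroy(GameArena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

The only platform dependency left is the single malloc call in `arena_init`, which is what keeps this approach portable.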

For threading, Mac OS X uses the POSIX threading library (pthreads). The implementation of this library isn't 100 percent complete, but the pieces that are missing are the more esoteric ones. If your game uses threads at all, you likely only want to create threads, mutexes, and conditions, and this portion of the API works fine. If you do want to do anything more interesting, there is the option to use the underlying Mach thread APIs (each pthread corresponds to a Mach thread).
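The subset mentioned here (threads, mutexes, and conditions) is enough for a typical "do this work in the background and tell me when it is finished" pattern. The following is a minimal sketch using only standard pthread calls; `LoadJob` and its functions are hypothetical names, and the summing loop is a stand-in for real work such as loading a level.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  done_cond;
    int             done;    /* protected by 'lock' */
    long            result;  /* protected by 'lock' */
    int             count;   /* input, set before the thread starts */
} LoadJob;

/* Thread entry point: does the work, then signals completion. */
static void *load_job_main(void *arg)
{
    LoadJob *job = (LoadJob *)arg;
    long sum = 0;
    int i;

    for (i = 1; i <= job->count; i++)
        sum += i;                      /* stand-in for real work */

    pthread_mutex_lock(&job->lock);
    job->result = sum;
    job->done = 1;
    pthread_cond_signal(&job->done_cond);
    pthread_mutex_unlock(&job->lock);
    return NULL;
}

/* Starts the worker, waits on the condition, and returns its result. */
static long load_job_run(int count)
{
    LoadJob job;
    pthread_t thread;
    int rc;

    pthread_mutex_init(&job.lock, NULL);
    pthread_cond_init(&job.done_cond, NULL);
    job.done = 0;
    job.result = 0;
    job.count = count;

    rc = pthread_create(&thread, NULL, load_job_main, &job);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&job.lock);
    while (!job.done)                  /* loop guards against spurious wakeups */
        pthread_cond_wait(&job.done_cond, &job.lock);
    pthread_mutex_unlock(&job.lock);

    pthread_join(thread, NULL);
    pthread_cond_destroy(&job.done_cond);
    pthread_mutex_destroy(&job.lock);
    return job.result;
}
```

In a real game the main thread would keep rendering and check `done` under the lock once per frame, instead of blocking in `pthread_cond_wait`.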

Mac OS X uses a dynamic linker called "dyld," which handles both launch-time linking of shared libraries and run-time linking of code modules. While it is possible to call dyld directly for your code-loading needs, it is probably easier to use the "dl" API defined in Linux, Solaris, and other Unix platforms. A wrapper for dyld that provides the dl interface can be found as part of the open source Darwin kernel that resides underneath Mac OS X (see For More Information).

The input to the dl wrapper API should be a Mach-O "bundle" file (as opposed to a dynamic library, or "dylib"). Using Project Builder, the IDE that ships with Mac OS X, or whichever IDE you prefer (CodeWarrior, for example), you can easily build a bundle file. Bundles are typically file wrappers, which simply means that they are directories that contain a variety of resources, one of which is code to be loaded. The path to this code is what should be passed to the dl API functions.
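A thin wrapper over the dl interface might look like the sketch below. It uses only the standard `dlopen`, `dlsym`, and `dlclose` calls; the `GameModule` type and function names are invented for this example, and the path passed in is assumed to be the code file inside the bundle directory, as described above.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

typedef struct {
    void *handle;  /* NULL when no module is loaded */
} GameModule;

/* Loads the code at 'code_path'. Returns 1 on success, 0 on failure. */
static int game_module_open(GameModule *m, const char *code_path)
{
    assert(m != NULL && code_path != NULL);
    m->handle = dlopen(code_path, RTLD_NOW);
    return m->handle != NULL;
}

/* Looks up an exported symbol, or returns NULL if it is not available. */
static void *game_module_symbol(const GameModule *m, const char *name)
{
    return m->handle ? dlsym(m->handle, name) : NULL;
}

/* Unloads the module if one is loaded. Safe to call more than once. */
static void game_module_close(GameModule *m)
{
    if (m->handle) {
        dlclose(m->handle);
        m->handle = NULL;
    }
}
```

Keeping the handle behind a small struct like this means the engine never sees whether the platform underneath uses dyld, the dl wrapper, or something else entirely.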

It is also possible to load CFM libraries into Mach-O processes using the Carbon Code Fragment Manager APIs. You might choose to do this if you want to use a toolkit that is only available in CFM format for the Mac (for example, the Bink video library).


Display Management: Core Graphics
At the heart of Apple's new Quartz rendering system is the Core Graphics framework. Core Graphics implements a powerful PDF-based imaging model and also supplies primitives for accessing and configuring the display hardware.

Core Graphics can easily support multiple displays, so the first thing to do is choose the display or displays and your preferred display mode(s). Each mode is a dictionary of key/value pairs which can be queried easily. The kCGDisplayIOFlags key returns a mask with various interesting bits. By far the most useful is the kDisplayModeStretchedFlag. On a Cinema Display (or other wide-aspect-ratio monitor), there may be multiple versions of the same mode, one with a square pixel aspect ratio and one that is nonsquare, taking advantage of the full width of the screen. Typically, you will want to pick the unstretched mode, but if your graphics technology allows for it, you could pick the stretched mode and apply a viewport transformation that accounts for the nonsquare pixel aspect ratio (and thus you get to use the entire viewable area of the monitor).
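The correction for a stretched mode comes down to a little arithmetic. The sketch below assumes you already know the physical aspect ratio of the display's visible area (how you obtain it is outside this example), and the function names are hypothetical.

```c
#include <assert.h>

/* Width of one pixel relative to its height, for a mode of
   mode_width x mode_height pixels stretched over a display whose
   visible area has the aspect ratio display_aspect (width / height).
   A result of 1.0 means square pixels. */
static double pixel_aspect_ratio(int mode_width, int mode_height,
                                 double display_aspect)
{
    assert(mode_width > 0 && mode_height > 0 && display_aspect > 0.0);
    return display_aspect * (double)mode_height / (double)mode_width;
}

/* Aspect ratio to feed into the projection setup so that circles stay
   round: the mode's pixel-count ratio scaled by the pixel aspect. */
static double projection_aspect(int mode_width, int mode_height,
                                double display_aspect)
{
    return ((double)mode_width / (double)mode_height)
         * pixel_aspect_ratio(mode_width, mode_height, display_aspect);
}
```

For example, a 1024 x 768 mode stretched across a 16:10 display has pixels about 1.2 times wider than they are tall, so the projection should use an aspect ratio of about 1.6 instead of the 1.33 that the pixel counts alone would suggest.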

The Core Graphics framework also allows control over the cursor. In addition to being able to hide/show and move the cursor, Core Graphics allows you to disassociate the mouse and cursor. This means that when the user moves the mouse, you will receive mouse events, but the cursor on the screen will not change position. This is useful for automatic demonstrations of an application, but it is also useful in full-screen OpenGL applications. Even though a window shields the entire display in full-screen mode, if you do not pin the mouse down and the user moves it high enough to hit the menu bar, the menu will start grabbing the mouse events. The easiest way to avoid this is to pin the mouse in the center of the screen while you are in full-screen mode (see Listing 1).

You will also want to allow the user to control the gamma setting inside the game. Core Graphics provides several functions for setting the gamma curve. Some of these functions take tables of data, and some of them take function descriptions (see Listing 2). When your game is about to exit (or if you have a "reset to defaults" option in your configuration screen), you will want to restore the gamma curve to that specified in the user's ColorSync settings.

3D Graphics Rendering: OpenGL
The clear choice for 3D on Mac OS X is OpenGL. As with every other platform, the Mac OS X version of OpenGL adds its own API for creating a GL context and binding it to a drawing surface. Mac OS X actually provides three such APIs. Two of these correspond directly to the two high-level UI toolkits. For Cocoa, there is NSOpenGL, while Carbon has AGL. Both of these are very thin layers on top of Core GL (CGL).

The first stage to creating an OpenGL context is to decide whether you are going to be in full-screen mode or not. If you are, then you need to capture the display and change its mode setting. Capturing the display prevents any other applications from accessing the display and, very importantly, prevents other running applications from being notified of a screen geometry change. If you don't capture the display, other applications will find out about the geometry change and move their windows around, annoying the player.

Next, you need to create a pixel format object that describes the list of attributes that the OpenGL context must have. This includes the number of color and depth bits, whether the context should support full-screen usage, and so on. In Mac OS X, the OpenGL context must always have the same color depth as the frame buffer. Once we have the pixel format, we simply use it to create the OpenGL context and then discard it (see Listing 3).

We can set and query a wide variety of parameters on the context. One useful example is setting whether buffer flushes are synchronized to the vertical refresh.

Before we can draw anything into the context, we naturally need to make it the current context. We also need to be able to clear the current context, display the context, and occasionally we need to find the current context. All of these operations are simple one-liners.

Music and Sound Effect Playback: Core Audio and Sound Manager
Mac OS X includes several APIs for making noise. Core Audio is the lowest-layer audio API. Core Audio uses a callback to provide sound samples. This callback is invoked in a different thread, so it must be at least minimally thread-safe. All samples are in floating-point format, making it easier to perform mixing. The callback receives several timestamps, two of which are valid depending upon whether the callback is being invoked to play or record audio samples. One of the timestamps is the current time; in the case of playback, the other timestamp is the time at which the samples currently being requested will actually be heard. This allows for fine-grain synchronization between what you see on the screen and what you hear.
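To show why floating-point samples make mixing easy, here is a sketch of the kind of routine such a callback could call to fill its output buffer. It is plain C with no Core Audio calls; `MixVoice` and `mix_voices` are hypothetical names, and mono output in the range [-1, 1] is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const float *samples;  /* mono source data in the range [-1, 1] */
    size_t       length;   /* number of samples in the source       */
    size_t       cursor;   /* index of the next sample to play      */
    float        gain;     /* per-voice volume                      */
} MixVoice;

/* Fills 'out' with 'frames' samples by summing every voice that still
   has data left, then clamps the sum to the legal range. */
static void mix_voices(MixVoice *voices, size_t voice_count,
                       float *out, size_t frames)
{
    size_t v, i;

    assert(out != NULL);
    for (i = 0; i < frames; i++)
        out[i] = 0.0f;

    for (v = 0; v < voice_count; v++) {
        MixVoice *voice = &voices[v];
        for (i = 0; i < frames && voice->cursor < voice->length; i++)
            out[i] += voice->samples[voice->cursor++] * voice->gain;
    }

    for (i = 0; i < frames; i++) {
        if (out[i] > 1.0f)
            out[i] = 1.0f;
        else if (out[i] < -1.0f)
            out[i] = -1.0f;
    }
}
```

Since the callback runs on a different thread, any code that adds or removes voices from the game thread would need to share a lock (or some other synchronization) with the code that calls `mix_voices`.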

Setting up Core Audio is very simple. We simply get the device on which we want to play audio, configure the buffer size, provide a callback, and tell the device to start playing (see Listing 4).

The audio API a level above Core Audio is the Carbon Sound Manager. Since Sound Manager is built on top of Core Audio, it will have slightly higher overhead than using Core Audio directly, although if you are using short samples internally instead of floating-point samples, you may be better off using Sound Manager. Unlike Core Audio, you do not need to provide a callback function, but can instead just send play commands whenever appropriate. Sound Manager does provide a command that will call a callback function, so you can issue another play command and request another callback when that buffer is finished.

In addition to the relatively simple Core Audio and Sound Manager APIs, Mac OS X provides the QuickTime API. QuickTime is extremely powerful and, thus, rather more complicated than either of the two lower-level APIs. QuickTime provides facilities for playing audio, video, Flash, PDF, and other media types. Of particular interest for game developers are the audio decompression capabilities of QuickTime. The code to do this is too long to present here, but is available with the rest of the example programs developed for this article (see For More Information).

Reading Input Devices
Keyboard and mouse input can be implemented through a combination of the normal Cocoa event mechanism and calls to Core Graphics. Support for other input device types (such as joysticks) can reportedly be accomplished via HID Manager, but as this API is not documented yet, I won't address it here (although check the Omni Group web site for updates later).

Keyboard up and down events are just normal Cocoa event objects. These objects carry a string of characters in the event, encoded as Unicode characters. Function keys and other special keys such as Help and Home are defined in the vendor-specific Unicode range. Modifier keys such as Shift, Control, and Command do not transmit keyboard up or down events, since there are no Unicode characters for these keys. There is a "flags changed" event that is sent when the state of these keys changes.

Mouse button events are transmitted in two different ways under Mac OS X. Left, right, and "other" buttons have individual up and down events, but if you want to handle a larger number of buttons, it is easiest just to ignore these events. Instead, you can use the "system defined" event, which is sent each time a mouse button changes state. Part of the data payload for this event type is a 32-bit mask of the current button state.
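Turning a button-state mask into individual press and release notifications is a matter of comparing it with the previous mask. The sketch below shows that step in isolation; it assumes bit N of the mask corresponds to button N, which is a guess for the purpose of illustration, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*ButtonCallback)(int button, int is_down, void *context);

/* Small recorder used to demonstrate the dispatcher. */
typedef struct {
    int downs;
    int ups;
    int last_button;
} ButtonLog;

static void log_button(int button, int is_down, void *context)
{
    ButtonLog *log = (ButtonLog *)context;
    if (is_down)
        log->downs++;
    else
        log->ups++;
    log->last_button = button;
}

/* Calls 'callback' once for every button whose bit differs between the
   old and new masks, lowest button first. Returns the mask to remember
   for the next event. */
static uint32_t dispatch_button_changes(uint32_t old_mask, uint32_t new_mask,
                                        ButtonCallback callback, void *context)
{
    uint32_t changed = old_mask ^ new_mask;
    int button;

    assert(callback != NULL);
    for (button = 0; button < 32; button++) {
        uint32_t bit = (uint32_t)1 << button;
        if (changed & bit)
            callback(button, (new_mask & bit) != 0, context);
    }
    return new_mask;
}
```

The engine keeps the returned mask between events, so it always has the current state of up to 32 buttons available without needing to poll the device.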

When the mouse moves, events are sent to your application. These events contain absolute mouse position information, clipped to the bounds of whatever screen the mouse resides on. The data in these events can be ignored, with the event just serving as a notification that you should call Core Graphics' CGGetLastMouseDelta() function. See Listing 5 for an example event-handling loop for mouse events.

There are a few common problems you might run into with input on Mac OS X:

  • The keyboard repeat and mouse-scaling settings are not automatically restored when your application terminates.
  • The "other" mouse button events have been added since the public beta but are not yet documented.
  • It is best to avoid assuming that your engine will be able to poll the state of a device button, since some platforms only have event interfaces.

Also, if you are creating a windowed application (not full-screen), you will need to create your own window object in which to place your game view. If you want to receive mouse movement events in this case, you need to request them explicitly (this is not necessary in full-screen mode).


PowerPC Specifics

The code snippets in the example programs are sufficient to build a game that runs on the Mac, but in order to make a game that runs as well as possible, there are a few PowerPC-specific issues that you may need to address, depending upon the architecture of your game.

Pitfalls
The Macintosh does not have the memory bandwidth that Intel boxes have. This is less true on the newer machines, but if you are targeting older iMacs, you will need to be aware of this. There are things that you can do to help avoid this problem. First, stay away from back-to-back load and store operations. Instead, load several values, operate on them, and then store them. The PowerPC chip has a huge register file compared to Pentiums. You can avoid a lot of memory operations simply by putting more values in registers. The PowerPC also provides a set of cache control instructions that allow you to preload cache lines, flush them, or zero out entire cache lines (much faster than doing it yourself, since you avoid the read to load the cache line).

Converting between floating-point and integer formats is expensive on the PowerPC. There are two reasons for this. First, since the PowerPC is RISC, the floating-point and integer units are only connected via memory through the load and store unit. Additionally, the PowerPC (prior to the G4 AltiVec instruction set) does not have architecture-level support for converting from integers to floats. Casting between float and integer is never free on any architecture, but it is definitely more expensive on the PowerPC. You can often get large performance increases by eliminating needless casting back and forth between int and float.

If your game engine is in C++, you will not be able to mix Objective-C code snippets as listed into the same files as your core C++ code. This isn't a huge problem, since all the platform-specific code should be isolated in its own files anyway. Currently, the simplest way to call between C++ and Objective-C is to use a vanilla C interface. If you design your platform support library interface in pure C, you won't even notice this problem. Apple is devoting engineering resources to this issue, however, and was scheduled to report on their progress at the 2001 Apple Worldwide Developers Conference in May.
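A platform support interface in pure C could be as small as the sketch below. The `PlatformDisplay` type and its functions are hypothetical; the implementation shown is a plain C stand-in so the example is complete, whereas on the Mac that part would live in its own Objective-C file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* ---- Interface: what the engine (C or C++) is allowed to see ---- */
#ifdef __cplusplus
extern "C" {
#endif

typedef struct PlatformDisplay PlatformDisplay;  /* opaque handle */

PlatformDisplay *platform_display_open(int width, int height);
int              platform_display_width(const PlatformDisplay *display);
void             platform_display_close(PlatformDisplay *display);

#ifdef __cplusplus
}
#endif

/* ---- Implementation: one file per platform, hidden from the engine ---- */
struct PlatformDisplay {
    int width;
    int height;
};

PlatformDisplay *platform_display_open(int width, int height)
{
    PlatformDisplay *display;

    assert(width > 0 && height > 0);
    display = (PlatformDisplay *)malloc(sizeof *display);
    if (display != NULL) {
        display->width = width;
        display->height = height;
    }
    return display;
}

int platform_display_width(const PlatformDisplay *display)
{
    return display->width;
}

void platform_display_close(PlatformDisplay *display)
{
    free(display);
}
```

Because the engine only ever holds an opaque pointer, the struct behind it can contain Objective-C objects on one platform and Win32 handles on another without the core code changing.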

Screenshot from Oni.

Optimizations
The PowerPC has a few instructions that deserve to be pointed out for possible optimizations.

If your engine uses the square root math library function, you might be able to use the frsqrte instruction. This instruction computes an estimate of the inverse of the square root. Depending upon your precision needs, you can use multiple Newton-Raphson refinement steps to extend the precision of the result. The frsqrte instruction can in practice be up to 16 times faster than 1.0 / sqrt(x). In addition to using this instruction for the reciprocal square root, you can also use it to compute a normal square root by simply multiplying its result by the original value, since x / sqrt(x) = sqrt(x).
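The refinement step can be written in portable C. In the sketch below the initial estimate is passed in as a parameter; on a PowerPC build it would come from the frsqrte instruction, but how you reach that instruction from your compiler is not shown here. The function names are hypothetical.

```c
#include <assert.h>

/* One Newton-Raphson step toward 1/sqrt(x):
   y' = y * (1.5 - 0.5 * x * y * y) */
static float rsqrt_refine(float x, float estimate)
{
    return estimate * (1.5f - 0.5f * x * estimate * estimate);
}

/* Applies 'steps' refinement passes to a rough reciprocal square root. */
static float rsqrt_from_estimate(float x, float estimate, int steps)
{
    float y = estimate;
    int i;

    assert(x > 0.0f && steps >= 0);
    for (i = 0; i < steps; i++)
        y = rsqrt_refine(x, y);
    return y;
}

/* sqrt(x) = x * (1 / sqrt(x)), so the same estimate also yields a
   normal square root with one extra multiply. */
static float sqrt_from_estimate(float x, float estimate, int steps)
{
    return x * rsqrt_from_estimate(x, estimate, steps);
}
```

Each step roughly doubles the number of correct bits, so the number of steps is the knob that trades speed against precision.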

The PowerPC provides an fsel instruction for performing simple if/else assignments. This can eliminate branches in inner loops, which not only reduces the total number of instructions issued, but also frees up branch prediction slots and eliminates the possibility of an incorrectly predicted branch.
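The selection that fsel performs is "if the first operand is greater than or equal to zero, take the second, otherwise take the third." A portable reference version makes it easy to restructure code around that shape first; whether a given compiler then turns it into an actual fsel instruction is not guaranteed. The names below are hypothetical.

```c
#include <assert.h>

/* Portable reference for the fsel selection rule:
   result = (test >= 0.0) ? if_nonneg : if_neg */
static double fsel_ref(double test, double if_nonneg, double if_neg)
{
    return (test >= 0.0) ? if_nonneg : if_neg;
}

/* Clamp written as two selections instead of two if statements. */
static double clamp_with_select(double x, double lo, double hi)
{
    double v;

    assert(lo <= hi);
    v = fsel_ref(x - lo, x, lo);     /* max(x, lo) */
    return fsel_ref(hi - v, v, hi);  /* min(v, hi) */
}
```

Writing min, max, clamp, and absolute value in terms of one select primitive keeps the inner loop free of explicit branches in the source, and leaves a single place to substitute a platform-specific version.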

Another interesting group of instructions is the lwbrx family (load word byte reversed indexed). This family of instructions allows you to load or store two- and four-byte values and also perform endian swapping. This is much faster than loading the value and then performing bitwise operations in order to swap the value around manually.
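For reference, this is the manual approach that the byte-reversed loads and stores replace: reading little-endian values from a data file one byte at a time. Funneling all such reads through a few helpers (hypothetical names below) gives you one place to substitute a faster PowerPC-specific version later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads a 32-bit little-endian value regardless of host byte order. */
static uint32_t load_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Reads a 16-bit little-endian value regardless of host byte order. */
static uint16_t load_le16(const unsigned char *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Writes a 32-bit value in little-endian order. */
static void store_le32(unsigned char *p, uint32_t value)
{
    assert(p != NULL);
    p[0] = (unsigned char)(value & 0xFF);
    p[1] = (unsigned char)((value >> 8) & 0xFF);
    p[2] = (unsigned char)((value >> 16) & 0xFF);
    p[3] = (unsigned char)((value >> 24) & 0xFF);
}
```

These helpers behave identically on big-endian and little-endian hosts, which also makes them a convenient correctness check for any optimized replacement.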

Performance-Monitoring Tools
Mac OS X ships with a full set of developer tools. Included in this are several performance monitoring tools. The Sampler application (and the "sampler" command line tool) will periodically stop all the threads in your application, record their stacks, and then let them continue. This provides you with a tree of wall clock time spent, easily allowing you to find the portions of your program that are using the most time. You can also invert the tree, putting the leaves at the root, allowing you to find small leaf routines that are taking a large amount of time.

Omni also provides the OmniTimer framework (see For More Information). This allows you to insert instrumentation calls into your application at key points (typically determined by running Sampler) in order to get very high precision timings. OmniTimer uses the PowerPC TBR (time base register) in order to minimize the overhead of collecting the timestamps.

Advanced Topics
The PowerPC G4 ships with what is considered by many to be the best SIMD instruction set in a consumer CPU. With the right type of task, AltiVec can provide huge performance gains. It is also possible to store floating-point (X, Y, Z, W) vector data in single AltiVec registers and operate on those registers using macros or inline functions. Care must be taken to keep the values in the vector registers to yield performance gains. The jury is still out on the feasibility of this approach, but it is worth considering.

Mac OS X provides full symmetric multi-processing. If you have the right sort of tasks (that is, very few synchronization points and low data flow), you can break them up into separate threads and Mac OS X will automatically schedule them to different CPUs (if available).

Mac Attack

In the past couple of years, Apple has increased its focus on gaming, and it shows. The Macintosh is now a great gaming platform and only looks to improve in the coming years. By porting to the Mac you can experience increased portability and robustness for all platforms.

Even better, by adding the Mac enthusiasts to your customer base, you can increase your revenue stream while continuing to produce excellent games.

 
