For today's technical Gamasutra feature, veteran PC and console programmer Nathan Whitaker shares optimization tips from porting a next-gen SDK to current-generation and handheld platforms, work that eventually yielded a roughly fourfold increase in performance.
"We recently had the rather daunting task of porting a next generation SDK to current generation and handheld platforms. This presented a variety of difficult challenges with one of the most rewarding being performance optimization. Over several weeks, we used the often overlooked methods in this article to gain an approximate 4x increase in performance and a healthy reduction in library size. A little assembly language was necessary but on the whole the bulk of the optimization was high level.
First Step – Profile Profile Profile
Before undertaking any optimization exercise, it is necessary to establish a solid baseline against which to benchmark optimization changes. To do this, we set up a simple, repeatable scenario in our game. This scenario represented a rather typical but nonetheless demanding situation for the SDK. In our case, we captured approximately 30 seconds of profiling data on each run. Results were then saved and compared with subsequent profiling runs.
We spent a lot of time in the profiler doing essentially two different types of run. The first type was sample-based profiling, where the profiler interrupts the application at a high frequency. With each interrupt, the program counter is queried and the tuning software makes a note of which function is being executed. This method of profiling is crude in terms of accuracy (due to the overhead of interrupting and logging results), but it is useful for quickly spotting where most cycles are being spent. Once we’d established potential offenders, we would then go through the source code and perform a second, more focused, function-specific profile. We were then able to quickly establish exactly where the time was being spent. The next step was then to figure out why."
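As a rough illustration of the workflow Whitaker describes, the sketch below tallies program-counter samples into a ranked list of hot spots and then compares a baseline run against a later one. It is not taken from the feature: the function names (`hot_spots`, `compare_runs`), the game function names, and the sample data are all hypothetical, and it assumes both runs were captured over the same duration at the same sampling rate so that raw sample counts are comparable.

```python
from collections import Counter


def hot_spots(samples, top=3):
    """Rank functions by their share of all program-counter samples."""
    counts = Counter(samples)
    total = sum(counts.values())
    return [(name, count / total) for name, count in counts.most_common(top)]


def compare_runs(baseline, candidate):
    """Return the per-function change in sample count (candidate minus baseline)."""
    names = sorted(set(baseline) | set(candidate))
    return {name: candidate.get(name, 0) - baseline.get(name, 0) for name in names}


# Hypothetical data: each entry names the function that was executing
# when the sampling interrupt fired.
baseline_samples = ["update_physics"] * 60 + ["render_scene"] * 30 + ["mix_audio"] * 10
optimized_samples = ["update_physics"] * 25 + ["render_scene"] * 30 + ["mix_audio"] * 10

# Where were most samples landing in the baseline run?
print(hot_spots(baseline_samples))

# Which functions gained or lost samples after an optimization pass?
print(compare_runs(Counter(baseline_samples), Counter(optimized_samples)))
```

A ranking like this only points at where to look. As the excerpt notes, the coarse sampling pass would still be followed by a more focused, function-specific profile of the worst offenders.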
You can now read the full Gamasutra feature on the topic, including more from Whitaker on improving instruction and data cache usage and optimizing memory management (no registration required; please feel free to link to this feature from external websites).