Dan Amerson is technical director for Emergent's cross-platform Gamebryo Element engine, and Gamasutra recently had an opportunity to ask him about the company's Floodgate technology, which, as Amerson explains, provides a cross-platform API to build, schedule, execute and synchronize workflows of interdependent tasks.
Cross-platform engines face new challenges in the present console generation, and Amerson also discussed the feedback from developers that led to the implementation of new multithreaded solutions for Gamebryo -- and how this affects work on the single-core Wii, which Gamebryo also supports. Are developers pushing current platforms as far as they can go, and is it really so difficult to work with the PS3's Cell architecture?
Can you talk about how the Floodgate initiative began?
Dan Amerson: Late in 2005, we were working on Gamebryo 2.2 and received an invitation to present at Sony DevCon and DevStation 2006. Two of our senior engineers, Ed Holzwarth and Vincent Scheib, decided to put together a presentation that offered a lot of technical value and was not just a demonstration of our technology.
Given that we, like many other developers, were faced with the problem of converting our mostly single-threaded engine to run well on the PS3, they started to research techniques for leveraging the SPUs on the PS3 while reducing the workload on programmers.
The presentation that they put together offered several proposals for how to implement various systems on SPUs. Those proposals evolved into a design that we later implemented during Gamebryo 2.3. Our goal was to put together a system that both simplified our internal development and allowed our licensees to easily parallelize their game logic. There are a few features from that presentation that we have not implemented yet, but the design comes largely from that presentation.
While throwing ideas around with marketing a few months later, I came up with the name "Floodgate," since the technology controls and unleashes the pent-up power on the consoles.
Were there specific needs the development community told you needed addressing?
We certainly received input from the people who saw our presentation. From that and our own internal engineering, we realized that Floodgate was solving a fundamental problem in this generation: each of these multicore machines is different, but studios can’t afford to rewrite multithreaded code for each platform.
Floodgate addresses that problem, and we realized that it was a powerful and useful solution for Emergent to offer to the market.
What is Floodgate used for?
At its core, Floodgate is a stream-processing engine. Developers provide execution kernels that are just code that runs per element of a stream and specify the inputs and outputs to the kernel. Inputs can be fixed across all elements or can be a stream of elements. These items allow developers to build up tasks that Floodgate will execute asynchronously.
On top of that, developers can link tasks together by specifying outputs from one task as inputs to another task. Floodgate uses that information to schedule tasks allowing complex sets of behavior to emerge. Most importantly, all of this work occurs through a single API allowing developers to target a new platform just by recompiling their kernels and application code.
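To make the kernel-and-task idea concrete, here is a minimal C++ sketch. It is not the Floodgate API: `Kernel`, `Task`, `Execute`, `Link` and `RunScaleThenOffset` are hypothetical names invented for illustration, and execution is synchronous where the real system would schedule tasks asynchronously.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical names throughout; this is not the Floodgate API.
// A kernel runs once per stream element. It receives one fixed input
// (shared by every element) and one element of the input stream.
using Kernel = std::function<float(float fixedInput, float element)>;

struct Task {
    Kernel kernel;
    float fixedInput;
    const std::vector<float>* inputStream;  // may point at another task's output
    std::vector<float> outputStream;
};

// Run the kernel over every element of the task's input stream.
inline void Execute(Task& task) {
    task.outputStream.resize(task.inputStream->size());
    for (std::size_t i = 0; i < task.inputStream->size(); ++i) {
        task.outputStream[i] = task.kernel(task.fixedInput, (*task.inputStream)[i]);
    }
}

// Link two tasks: the consumer reads the producer's output stream,
// so the producer has to be executed first.
inline void Link(const Task& producer, Task& consumer) {
    consumer.inputStream = &producer.outputStream;
}

// Two linked tasks: scale every element by 2, then add 1.
inline std::vector<float> RunScaleThenOffset(const std::vector<float>& values) {
    Task scale{[](float s, float x) { return s * x; }, 2.0f, &values, {}};
    Task offset{[](float o, float x) { return x + o; }, 1.0f, nullptr, {}};
    Link(scale, offset);
    Execute(scale);   // dependency order: producer first
    Execute(offset);
    return offset.outputStream;
}
```

Because the link records which task feeds which, a scheduler has enough information to order the work; the sketch simply hard-codes that order.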
How does Floodgate operate differently between Xbox 360 and PS3?
Once a Floodgate task is submitted, the system determines the most efficient way to execute that task. On PS3, the task is divided up into workloads and sent to the SPUs for execution. The system automatically determines the optimum size for each workload to minimize the number of DMA transfers. When the results are ready, they are recombined by Floodgate and made available to the application.
On Xbox 360 and multicore PCs, a similar process happens. The task is divided up into workloads. In this case, though, we distribute the workloads across multiple threads. Floodgate also determines the optimum size for these workloads to maximize data pre-fetching.
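The thread-based path can be sketched roughly as follows. This is an illustration under assumptions, not Emergent's implementation: `RunWorkloads` is a hypothetical function, the workload size is passed in by hand rather than chosen automatically, and the per-element work (squaring a float) is a placeholder for a real kernel.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical sketch: split one task into fixed-size workloads and let a
// small set of worker threads pull workloads until none remain.
inline void RunWorkloads(std::vector<float>& data,
                         std::size_t workloadSize,
                         unsigned threadCount) {
    if (workloadSize == 0) workloadSize = 1;
    if (threadCount == 0) threadCount = 1;
    const std::size_t workloadCount =
        (data.size() + workloadSize - 1) / workloadSize;

    std::atomic<std::size_t> next{0};  // index of the next unclaimed workload
    auto worker = [&] {
        for (;;) {
            const std::size_t w = next.fetch_add(1);
            if (w >= workloadCount) return;
            const std::size_t begin = w * workloadSize;
            const std::size_t end = std::min(begin + workloadSize, data.size());
            // Placeholder kernel: square each element of this workload.
            for (std::size_t i = begin; i < end; ++i) data[i] = data[i] * data[i];
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < threadCount; ++t) threads.emplace_back(worker);
    for (std::thread& th : threads) th.join();  // results are complete after join
}
```

Each workload covers a contiguous range, so no two threads write the same element and the loop needs no locking beyond the shared counter.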
You've previously talked about additional applications that Floodgate can benefit -- particle systems, physics, etc. -- can you explain how that works, and what more it might be useful for in future versions of Gamebryo? Can you explain the mesh modifier system you mentioned?
The mesh modifier system is very exciting because it’s enabled an even easier pathway for us to use Floodgate internal to our engine. Mesh modifiers are attached to mesh objects in Gamebryo. While they are not required to do so, most contain one or more Floodgate tasks that are associated with specific points in each frame of application execution.
When a mesh reaches that point during execution, it submits its mesh modifiers. At a later time, the mesh can wait on any Floodgate tasks to complete if necessary before progressing.
With this system, we’re really transforming code that was previously single threaded into high performance multithreaded code. Particle systems are just one example of the code that previously ran during a synchronous update call but now runs in parallel and on the PS3 SPUs using Floodgate.
Another great example is skinning or morphing. With mesh modifiers, we can associate a skinning or morphing modifier and set it to begin working when the mesh is determined to be visible. For meshes that aren’t on screen, the mesh modifier system simply skips the submission of the Floodgate task.
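The submit-then-wait pattern described here, including the skip for off-screen meshes, might look something like the sketch below. `SkinningModifier`, `Submit` and `Complete` are hypothetical names rather than Gamebryo classes, `std::async` stands in for a Floodgate task, and the "skinning" is a placeholder that just offsets each value.

```cpp
#include <cassert>
#include <future>
#include <vector>

// Hypothetical stand-in for a skinning mesh modifier; not Gamebryo's classes.
struct SkinningModifier {
    std::vector<float> vertices;
    std::future<void> pending;  // valid only while a task is in flight

    // Called at the point in the frame where visibility is known.
    // Returns true if a task was submitted.
    bool Submit(bool visible) {
        if (!visible) return false;  // off screen: skip the work entirely
        pending = std::async(std::launch::async, [this] {
            for (float& v : vertices) v += 1.0f;  // placeholder for skinning math
        });
        return true;
    }

    // Called later in the frame, before the mesh needs its results.
    void Complete() {
        if (pending.valid()) pending.get();
    }
};
```

Between `Submit` and `Complete` the calling thread is free to do other per-frame work, which is where the gain over a synchronous update call would come from.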
Have you heard of any particularly unique uses of the tech in products currently on the market?
We’re always coming up with new uses for the technology internally. For example, there’s a really interesting water simulation in one of our samples that uses Floodgate to accelerate execution.
Since the first version of Floodgate was made available with Gamebryo 2.3 in late April, we haven’t seen the first wave of Gamebryo titles using the technology to hit the market. We can’t wait to see the variety of games and genres that are built with this technology.
Now that Gamebryo is coming to the Wii, what sacrifices has the technology had to make?
I wouldn’t say that we’ve made any sacrifices with our technology in taking it to Wii. We’ve always striven for maximum compatibility across platforms with Gamebryo, and our commitment to Wii is no exception.
With Gamebryo 2.3 on Wii we have almost all of the core features from Gamebryo 2.3 that the hardware will support. Obviously, there are some things that just aren’t portable to Wii from the other platforms. Then again, I can’t run a D3DX Effect on PS3, so I don’t consider limitations of the platform a sacrifice.
Is Floodgate's technology applicable to the Wii without a multicore processor?
That’s a tricky question to address, because it really depends on how you define “applicable.” We have Floodgate running on Wii, and you are right that there are not as many opportunities for performance gains with the technology on Wii, since it’s running on a single core.
However, we do our best to keep the overhead of using Floodgate minimal, so that companies can leverage code from their titles across all platforms without experiencing an overly high performance impact on platforms that don’t have as many cores. Remember that one of the goals with Floodgate was to reduce the workload on programmers. With that fact in mind, I’d say that it’s absolutely applicable.
Can you give a percentage of performance and features that the Wii can handle as compared to the 360 and PS3? Given your technical knowledge of the platform, are any games currently on the market pushing the tech as far as it can go? Can it be pushed further?
Coming up with a percentage is a perilous task, and it’s one that I won’t engage in these days. Any number that I give will be disputed the moment I say it. It’s not just an apples-to-oranges comparison; it’s like comparing apples to steaks. They’re both food, but the comparison really ends there.
It’s well understood that Nintendo chose a different strategy than Microsoft and Sony, and that the Wii is not as powerful from a raw numbers or features standpoint as PS3 and Xbox 360, so arguing over whether the GPU in Wii is 30% or 70% as powerful as the RSX or Xenos is a moot point.
As far as your second question goes, I’m pretty comfortable saying that no one is pushing the platform as far as it can go. Necessity is the mother of all invention, and developers will find ways to get more power out of all of these consoles. I think we saw that pretty clearly with the last round of titles in the previous generation. There really was no comparison between them and the launch titles.
We've seen and heard about a number of high profile delays and performance issues recently with the PS3, and you've presumably had to work quite closely with its architecture compared to the others -- what's your take? Is working with the Cell truly as difficult as the anecdotal evidence suggests?
Programming for Cell requires managing a different set of problems than the shared memory machines that most developers are used to using. You have to make sure your data fits into the local stores of the SPUs, and you need to get that data to those SPUs.
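As a rough worked example of the "fits into the local store" constraint: each SPU has 256 KB of local store, shared by code, stack and data. The sketch below assumes, purely for illustration, that 64 KB is reserved for code and stack and that a workload holds one input and one output buffer; the function names are hypothetical, and both per-element sizes are assumed to be non-zero.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kLocalStoreBytes = 256 * 1024;  // SPU local store size
constexpr std::size_t kReservedBytes = 64 * 1024;     // assumed: code and stack

// Largest number of elements whose input and output buffers both fit in
// the remaining local-store budget.
constexpr std::size_t MaxElementsPerWorkload(std::size_t inputBytesPerElement,
                                             std::size_t outputBytesPerElement) {
    return (kLocalStoreBytes - kReservedBytes) /
           (inputBytesPerElement + outputBytesPerElement);
}

// Number of workloads needed to cover elementCount elements (rounds up).
constexpr std::size_t WorkloadCount(std::size_t elementCount,
                                    std::size_t maxPerWorkload) {
    return (elementCount + maxPerWorkload - 1) / maxPerWorkload;
}
```

With 32 bytes of input and 16 bytes of output per element, the budget of 196,608 bytes holds 4,096 elements per workload, so a 10,000-element stream would be split into three workloads.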
If you write a multithreaded engine for a multicore or PC, you will almost definitely have to re-architect significant pieces to run well on the Cell architecture. They will run well if you invest enough development time, though; there’s certainly power in that chip.
Since developer time is expensive and limited, we either see delays or performance issues when there isn’t enough time to get everything running correctly on the Cell architecture. Fundamentally, this goes back to your first question about how Floodgate got started. We recognized that there are very significant costs to rewriting systems for each platform. With Floodgate, you effectively get to write to one platform, Floodgate, and the port to other platforms is eased significantly by the technology under the hood.
You also seem to be in a unique position where rather than simply exploiting hardware to its limits, you're attempting to exploit it to its limits in order to level the playing field cross-platform -- is that a more difficult challenge?
It’s absolutely a more difficult problem, but no one buys software because it solves simple problems.