Tool Postmortem: Ubi Soft Entertainment's GL for Playstation 2

When Ubi Soft researchers started thinking about developing GL for Playstation 2, the official OpenGL had not been released, and other 3D APIs were either under development or only for Japanese customers. Find out how building their own internal tools led to superior quality in Gamasutra's first tool postmortem.

June 19, 2001

Author: Han Da Qing


Ubi Soft's GL for Playstation 2 is an easy-to-use, cross-platform system with an extendable architecture. Its efficiency relies on its ability to hide the complexity of Playstation 2 and on its lightweight graphics library.

At the Research and Development department, our work is to undertake general research in the game development field in order to provide innovative technical tools to the production department. Our department specializes in cutting-edge graphics research, and we had been implementing those research results using OpenGL on PC.

The GLPS2 adventure began as the research subject of an engineer who was porting an OpenGL sample to the Playstation 2 platform. He thought of building a subset of OpenGL trimmed from the open-source Mesa library. Later on, the research team had a need for such a library, and the production team started to show interest in it. The official OpenGL on Playstation 2 had not been released at that time, and other 3D APIs were either under development or only for Japanese customers, with the exception of Renderware 3. The production department needed an efficient tool in a short period of time, ready to use with full support. For its part, the R&D team had been working on Playstation 2 and wanted to put their research achievements into a library that clients could use. This library was also meant to free the researchers of the general graphics research team from Playstation 2-related issues, since they specialized in advanced graphics and not in Playstation 2.

The development of Playstation 2 applications is considered highly complicated, largely due to the sophisticated system architecture, the heavy assembly-level development and the lack of hardware functionality. We hoped to remove all difficulties related to the Playstation 2 to make developers' lives easier. The trade-off is that OpenGL tends to hide all the underlying details, which is restrictive for game developers. Take VRAM management as an example: with OpenGL you never know where your buffers and textures are located. We hoped to give developers as much flexibility as possible by providing plenty of Playstation 2-specific extensions and by revealing many underlying details, allowing the library to work together with lower-level APIs without too much difficulty.

The biggest challenge we anticipated was striking a balance between performance and OpenGL compatibility. We were in a dilemma from the very beginning, since sacrificing too much performance would make our product less attractive and useful for our clients, but sacrificing too much OpenGL compatibility would do the same.

Implementation and Application

The implementation went through three phases: preparation and strategy, subset choice and implementation, and platform-specific enhancement and optimization.

In the first phase, we did a lot of research and testing on OpenGL and Playstation 2. We concluded that OpenGL and Playstation 2 are conceptually different, and that we should accept, allow and ensure the deviation. Hence we came to the overall strategy that GLPS2 should be a subset of OpenGL so as to hide the weaknesses of the specific platform, but that GLPS2 also needed to be a superset of OpenGL in order to highlight the advantages of the platform.

In the second phase, we categorized OpenGL functions into groups and then set up different strategies for different groups of functions. Strictly game-related functionality should definitely be written from scratch for the sake of performance, and any non-game-related, low-efficiency functionality should be omitted. When quick prototyping is a priority, on the other hand, it's a blessing to adapt the free Mesa source partially or totally while leaving enough room for future enhancement and renewal.

In the final phase, we had to fully exploit the potential of the Playstation 2 hardware. We wanted GLPS2 to handle all the Playstation 2 rendering features. None of these have counterparts in standard OpenGL, so we added them as new Playstation 2-specific extensions -- instead of twisting existing OpenGL interfaces -- to avoid ruining OpenGL compatibility.

The OpenGL interface tends to hide as many underlying features as possible to ensure its easy-to-use nature, but Playstation 2 can't exhibit its optimal performance unless you think at the OS level and hardware level while coding. We thought this way while making the library, and ideally users of any library should be able to do the same while writing their own code. Several problems arose at this stage, such as device competition and register conflicts. We focused our energies on avoiding these problems, for example by allowing users to dynamically disable access to devices like PAD and by allowing users to write to registers directly alongside library function calls. So far, GLPS2 has been used for the development of Disney's Dinosaur, and is planned to be used for VIP and other next-generation projects from Ubi Soft Entertainment.

What Went Right

1. Sufficient preparation. We did a lot of research in the early stages, which included research on Playstation 2-specific functionality. The team working on the project consisted of several engineers, and everyone worked on all of the research tasks. Our work focused on studying the materials Sony provided, writing a simulation of the Sony library using OpenGL, dissecting OpenGL and porting to OpenGL. The research also involved learning assembly and working on general graphics topics: Nurbs sky simulation, volume shadow, soft shadow, fluid, noise and turbulence, palettized texture, animation, four-Nurbs methods, and more. The management of the preparation process was excellent, and included distribution of test materials and attendance at game development conferences in order to keep the team updated.

2. A global and complete framework. We took all the source material from Mesa, GLU, AUX and GLUT and did some heavy trimming. We developed a large framework with empty functions and filled in the gaps one by one. Then, we elaborated upon our concepts in a detailed document to avoid misdirection later in the project. For example, we didn't plan to implement the accumulation buffer and stencil buffer in the early stages of GLPS2's development, since they were not supported by the Playstation 2 hardware and we didn't want to implement too many things in software. However, out of respect for the OpenGL specification, we kept their interfaces and basic data structures and added some warnings in the document. Later on, we found some alternative methods to handle volume shadow and fast buffer blending that made full use of these existing standard interfaces, and we finally implemented some of the functionality without too many additional changes.


3. Working closely with project teams. Close cooperation with project teams shifted the orientation of the development from compatibility to performance.

Each client was actually an aggressive tester, and their programs represented a practical test-bed. The library grew through their precious feedback. The game titles shipped using this library proved the quality and efficiency of our product, bringing in more clients --which was the best thing we could anticipate for a middleware product.

4. Rewriting the code. It was impossible to have the code well written the first time. For a library that needs repeated upgrading and maintenance, rewriting early saved a lot of hassle. From time to time, we discovered bottlenecks or design mistakes in our library and decided to switch to a new implementation, discarding the previous method, obliterating all previous code, and destroying the previous integrity and reliability. Doing so was frustrating and annoying, especially when we felt we could still manage with the previous implementation, which had already been fully optimized and thoroughly tested. Eventually, after some rewriting and a period of further development, the effort began to pay off. Rewriting something that is not perfect is better than twisting the code to try to make it perfect.

5. We achieved a good level of performance. We made some choices in our architecture that led GLPS2 to a good level of performance. We chose delay-mode execution of OpenGL function calls, a selectable path in the VU1 pipeline, dynamic vertex arrays, and user-maintained textures and CLUTs. We also managed to balance the distribution of calculation between EE and VU1 and to allow access to underlying Playstation 2 functions.

The choice of turning to delay-mode execution was really a big step. In the earlier version of GLPS2, we implemented everything in immediate mode, which meant that glFlush and glFinish did nothing but maintain OpenGL compatibility. When we noticed that it was essential to let EE and VU1 work completely in parallel, we changed everything to delay mode, which meant that glFlush swapped the double DMA buffers and started the DMA transfer, while glFinish waited for the transfer to complete. The difference between delay mode and immediate mode is, in a way, similar to the difference between writing to a file with or without a cache -- the performance is obviously different.
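The delay-mode idea can be illustrated with a small model. The sketch below is not GLPS2 code: the class `DelayedRenderer`, its method names, and the `consume` callback (a stand-in for the DMA target) are hypothetical, and a Python thread plays the role of the DMA transfer. It also assumes that a flush first waits for any transfer still in flight, which the text above does not specify.

```python
import threading

class DelayedRenderer:
    """Toy model of delay-mode execution with two command buffers."""

    def __init__(self, consume):
        self._buffers = [[], []]   # double buffer of recorded commands
        self._front = 0            # index of the buffer being recorded into
        self._consume = consume    # hypothetical stand-in for the DMA target
        self._transfer = None      # background "DMA transfer", if any

    def call(self, name, *args):
        # Record the command instead of executing it immediately.
        self._buffers[self._front].append((name, args))

    def flush(self):
        # Assumption: wait for the previous transfer before reusing its buffer.
        self.finish()
        pending = self._buffers[self._front]
        self._front ^= 1                      # swap the two buffers
        self._buffers[self._front] = []       # start recording into a clean one
        self._transfer = threading.Thread(target=self._run, args=(pending,))
        self._transfer.start()                # kick off the transfer

    def finish(self):
        # Block until the in-flight transfer (if any) has completed.
        if self._transfer is not None:
            self._transfer.join()
            self._transfer = None

    def _run(self, commands):
        for command in commands:
            self._consume(command)
```

In this model, calls recorded after `flush()` accumulate in the second buffer while the first is still being consumed, which is the kind of overlap the EE and VU1 need in order to work in parallel.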

It's not difficult to understand how the use of the programmable VU1 boosts performance if you understand the benefits of hardware transform and lighting. OpenGL is a big state machine with numerous setting combinations. The internal calculation, in terms of T&L and other vertex-based processes, varies according to the environment settings. Thus, for any specific condition, you need to select the optimal path, based on the OpenGL state settings, that minimizes calculation and maximizes performance. You might think that some simple branch instructions would do, but unfortunately a branch instruction is itself time-consuming in VU1 and, moreover, is the biggest obstacle to optimization. In this sense, our path-selecting algorithm not only determines state -- it also determines performance. Outside VU1 we translate OpenGL states into optimized internal states and then set up the VU1 environment accordingly; inside VU1 we create one layered main pipeline and several fast paths with multiple entries and exits. Part of the workload was moved out of VU1, and with the help of OpenGL display lists some work was moved further, from list-replay time to list-building time. In addition, we adopted some other approaches to cope with the remaining branching workload that can't be moved out. For instance, when the cost of a certain function is less than the cost of the branch instructions needed to distinguish the paths with and without this function, we simply add the new function to the existing paths first, and perform further optimization to hide its cost later. As for texture coordinate transformation, we separated the UV offset from the texture matrix, performing the former within the transform routine and the latter within the preparation routine only if required.
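To make the path-selection idea concrete, here is a minimal sketch of hoisting state tests out of the per-vertex loop. It is an illustration only, not the VU1 microcode: the state bits `LIGHTING` and `TEXTURE`, the three stage functions, and the arithmetic inside them are all invented for the example.

```python
LIGHTING = 1 << 0   # hypothetical state bits
TEXTURE = 1 << 1

def transform(v):
    # Always needed: a stand-in for the vertex transform.
    v["pos"] = tuple(2.0 * c for c in v["pos"])

def light(v):
    # Stand-in for per-vertex lighting.
    v["color"] = tuple(0.5 * c for c in v["color"])

def offset_uv(v):
    # Stand-in for the UV offset applied inside the transform routine.
    v["uv"] = (v["uv"][0] + 0.25, v["uv"][1] + 0.25)

def build_path(state_bits):
    """Translate state into a fixed list of stages, once per state change."""
    stages = [transform]
    if state_bits & LIGHTING:
        stages.append(light)
    if state_bits & TEXTURE:
        stages.append(offset_uv)
    return tuple(stages)

# One prebuilt path per state combination, so selection is a table lookup.
PATHS = {bits: build_path(bits) for bits in range(4)}

def process_batch(vertices, state_bits):
    path = PATHS[state_bits]       # the only state-dependent decision
    for v in vertices:
        for stage in path:         # no state tests inside the vertex loop
            stage(v)
    return vertices
```

The state is examined when the table is built, and each batch pays for a single lookup rather than a branch per vertex per feature.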

The dynamic vertex array is a means of performing vertex-based animation, such as vertex blending and color blending. Similarly, user-maintained textures are useful for generating procedural textures, and the user-maintained CLUT is designed for traditional CLUT animation. If you understand why DirectX provides an interface for rendering from user memory pointers, you will be happy to see our extensions involving user memory pointers. By making full use of the flexibility of the Playstation 2's smart DMA (by using DMA reference chains), we minimize the cost of providing data pointers instead of the data itself. Note that the time saved by avoiding redundant memory copies is a performance benefit as well as a flexibility benefit.
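The benefit of passing pointers rather than data can be sketched as follows. This is only an analogy to a DMA reference chain: the class `ReferenceChain` and its methods are hypothetical, and Python's `memoryview` stands in for an address-and-length reference into user memory.

```python
class ReferenceChain:
    """Toy reference chain: stores views of user memory, never copies."""

    def __init__(self):
        self._refs = []

    def add_ref(self, buffer, start, length):
        # Keep a zero-copy view of the user's buffer, not the bytes themselves.
        self._refs.append(memoryview(buffer)[start:start + length])

    def transfer(self):
        # Stand-in for the DMA controller walking the chain at kick time:
        # the data is read only now, so user-side updates are picked up.
        return b"".join(bytes(ref) for ref in self._refs)


vertices = bytearray(b"AAAABBBB")      # user-owned, user-animated data
chain = ReferenceChain()
chain.add_ref(vertices, 0, 4)
chain.add_ref(vertices, 4, 4)
vertices[0] = ord("C")                 # animate in place, no re-submission
frame = chain.transfer()
```

Because the chain holds references, the application can keep animating its own buffers between frames without any intermediate copy into library memory.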

Balancing EE and VU1 calculations is a big issue with the Playstation 2. VU1 is known to be highly efficient; however, its resources are limited. The truth is that the VUs are designed to do specific calculations, that is, parallel vector and matrix calculation. In this sense, it's not wise to move certain kinds of calculation, e.g., branch-heavy calculation, into VU1. For example, we implement cube mapping entirely outside VU1, multi-texture totally inside VU1, layered frustum and guardband bounding-sphere culling mainly inside VU1, and Nurbs partially inside VU1. The Nurbs VU1 routine sits on top of our VU1 T&L pipeline, hence it won't affect other parts of our VU1 routines.

Another issue was allowing access to underlying Playstation 2 functions. GLPS2 can be regarded as a wrapper around underlying Playstation 2 functions. The simpler the wrapper, the higher the performance, but the lower the functionality. It's a dilemma. A straightforward solution is to allow mixed library access, not only to recover all the functionality missing from the wrapper, but also to bypass some parts of the wrapper as appropriate. However, a state-driven library (like OpenGL) isn't that friendly to mixed library access. Therefore, an alternative way out is to leave a backdoor inside the wrapper, i.e., to allow access to underlying functions inside the library through specific interfaces. For example, it's up to you to decide whether the GLPS2 built-in texture management is suitable for your data. If not, you can simply use glGet, a standard OpenGL function, to get back the texture base pointer, which is a required parameter of your own texture management function.

Our efforts to perform heavy assembly-level development and optimization turned out to be a big win. On every development platform, game developers tend to dive to the assembly level to achieve maximum performance. This is especially true for Playstation 2 developers, because assembly is necessary in some parts of the Playstation 2 development process. For a library, it's imperative to optimize across the board, because the bottlenecks of a library vary under different conditions. Besides, there are many other approaches we have taken to make GLPS2 perform better, which can't be itemized here. In general, half of them concern Playstation 2-specific issues, most of which you can find in SCE newsgroups and FAQs, such as the transmission efficiency of 32-bit textures versus that of 4/8-bit textures. The other half are general optimization rules, like lookup tables based on Taylor expansion.
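As one possible reading of "lookup tables based on Taylor expansion", the sketch below approximates sine from a small table plus a second-order Taylor correction around the nearest table entry. The table size, the names, and the choice of sine are assumptions made for illustration; the article does not say which functions GLPS2 tabulated or how.

```python
import math

TABLE_SIZE = 256                                  # assumed table resolution
STEP = 2.0 * math.pi / TABLE_SIZE
SIN_TABLE = [math.sin(i * STEP) for i in range(TABLE_SIZE)]
COS_TABLE = [math.cos(i * STEP) for i in range(TABLE_SIZE)]

def fast_sin(x):
    """Approximate sin(x) from the nearest table entry plus Taylor terms."""
    i = int(round(x / STEP))        # index of the nearest table node
    dx = x - i * STEP               # offset from that node, |dx| <= STEP / 2
    i %= TABLE_SIZE                 # wrap the index into one period
    s, c = SIN_TABLE[i], COS_TABLE[i]
    # sin(x0 + dx) ~= sin(x0) + cos(x0) * dx - sin(x0) * dx^2 / 2
    return s + c * dx - 0.5 * s * dx * dx
```

With 256 entries the offset never exceeds about 0.0123 radians, so the neglected third-order term stays below roughly 3.1e-7; a larger table or more Taylor terms trade memory or arithmetic for accuracy.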

What Went Wrong

1. The transition from a research study to a project development. Initially GLPS2 was under development as a research subject, but it became a formal project development. The direction of the research changed and the goal altered, which led to the current version of this library being to some degree general in order to respond to game production needs. We carried too many research achievements into the library from the very beginning, and the clients regarded the result as somewhat slow. If we wanted to return to the original track of the research subject, what we are doing now is too specific to project development and is far away from the OpenGL specification.

2. Premature optimization. It was a dilemma whether or not to optimize the library under development before issuing a new release. A library is ever-evolving by nature, especially a library that's in use. However, optimizing a Playstation 2 application was bound to involve a great deal of assembly-level reordering and the like, so we had to optimize again and again. For instance, each time we needed to add a new feature or make a major modification inside VU1, we had to break a lot of previously optimized and debugged code. After each modification we had to optimize and debug it all over again.


3. Lack of testing. There was no tester for this research project; the programmers did the job. We sometimes had to rely on our close clients -- programmers in a game project inside our company -- to test our library. A bug report from programmers in a game project is less reliable than one made by a tester. For one thing, programmers in a game project concentrate on their own tasks and will not go out of their way to thoroughly test the tools they are using, like GLPS2. For another, they mainly exercise our library through their unstable, ongoing project rather than through specific small test samples, so a problem they find could be a real bug, a misuse, or even something unrelated to the library. When a suspected bug came to us, we had to collect clues, guess the possible problems, reproduce them with the samples we had at hand or ask the clients to reproduce them in a simple sample, or eventually even join their team and dive into their project. Communication and debugging time was lost on both sides.

4. Defects of new development tools. We used many tools in the development process of GLPS2, but most of the time we were not the first to try new tools. We should have paid more attention to middleware tools when making our own tools. We started development when low-level tools such as dsedb and ee-gdb were at an early stage of development, and we got used to them, meaning we didn't have the most efficient and mature tools with which to develop our library. We should have spent more time testing new tools; it would have saved time in the development of the library.

5. The system had performance bottlenecks. The low efficiency of VU1 is a consequence of the functionality and pipeline structure. There is also a lack of basic VU1 features, such as texture coordinate autogeneration, indexed triangles and strips, vertex blending, etc. Of course, almost all of these features can be implemented outside VU1, either within GLPS2 or within the user's code. However, performance suffers in some cases. For example, our previous experience showed that vertex blending can be handled smoothly and efficiently using the combination of VU0, DMA and SPR, but some evidence from other sources favors having this job done in VU1. So this looks more like an application-related and data-dependent issue. On the EE side, performance suffered from the overall structure and from simplified algorithms. Concretely, the initial big framework made our lives easier by letting us add new features gradually, but it also introduced side effects such as cache misses caused by the function jump table and small functions. Since GLPS2 is only a subset of OpenGL, we made some assumptions and gave up some features of standard OpenGL. The side effect was algorithms too simple to serve some complex conditions well. For texture management, for example, we adopted random texture switching and ignored texture priority. The underlying assumption is that Playstation 2 has very limited VRAM with relatively fast transfer performance. In each frame, a large number of textures need to be transferred into VRAM. Thus, it's not very important to determine which one should be discarded at any given time.
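The texture-switching trade-off can be sketched with a tiny model. It assumes that "random texture switching" means evicting a randomly chosen resident texture when VRAM is full; the class `RandomTextureCache`, its slot model, and the upload counter are hypothetical and ignore texture sizes entirely.

```python
import random

class RandomTextureCache:
    """Toy VRAM model: fixed number of slots, random eviction, no priorities."""

    def __init__(self, capacity, rng=None):
        self.capacity = capacity           # number of texture slots in "VRAM"
        self.slots = []                    # ids of textures currently resident
        self.uploads = 0                   # how many transfers were needed
        self._rng = rng or random.Random()

    def bind(self, texture_id):
        if texture_id in self.slots:
            return                         # already resident, nothing to send
        self.uploads += 1                  # texture must be transferred
        if len(self.slots) < self.capacity:
            self.slots.append(texture_id)
        else:
            # Evict a random victim; texture priority is deliberately ignored.
            victim = self._rng.randrange(self.capacity)
            self.slots[victim] = texture_id
```

When most textures are re-sent every frame anyway, the choice of victim matters little, which is the assumption stated above; a workload with a few hot textures and many cold ones is the "complex condition" where such a simple policy would start to cost extra uploads.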

The Bottom Line

Ubi Soft's GL for Playstation 2 makes game developers' lives easier on the Playstation 2 platform. We hope to improve GLPS2 continuously according to users' needs and feedback. GLPS2 is an example of how a self-made middleware product can improve the productivity of a game company. To achieve ideal productivity, why not expand our vision from developing everything related to a game project from the ground up to using a combination of third-party middleware products, building some of our own specific engines and tools, and making common libraries and tools -- just what we have done with GLPS2? Software reuse isn't a far-away oracle or a worn-out cliché.

Build or Buy?
At the time we started thinking of developing GLPS2, the official OpenGL had not been released, and other 3D APIs were either under development or only for Japanese customers. Building our own tools allowed us to have superior-quality tools with internal support and the possibility of on-site support.

We chose to make our own library not only because there was a lack of libraries on the market, but also because we wanted a flexible tool. For instance, we wanted the possibility of adding any features needed, integrating any kind of libraries and performing any modification users demanded. Our close clients can get all source and can do their own case by case modifications, trimming and extending for their game titles. This is how it works from now on at Ubi Soft, and we are having great success with a great tool!




