Introduction to COLLADA

In this extract from COLLADA: Sailing the Gulf of 3D Digital Content Creation, PlayStation 3 graphics architect Rémi Arnaud and Mark Barnes introduce the open standard technique for exchanging normally incompatible 3D assets.

Mark Barnes, Blogger

March 29, 2007

The following is an excerpt from A K Peters, Ltd.'s COLLADA: Sailing the Gulf of 3D Digital Content Creation, specifically the book's opening chapter. It is reproduced here by permission of its original publisher and authors.

Overview

This chapter explains why the COLLADA technology has been developed. It provides a global view and defines the problems addressed by the technology, the main actors, and the goals and perspectives for this new technology. It also provides a historical overview and information on how COLLADA is being designed and adopted. The goal is to give the reader an insight into how the design choices are made and how this technology might evolve.

Problem Domain

An interactive application is composed of two major components:

the application, which provides information in real time to the user and the means to interact with it;
the content, which contains the information through which the application navigates and provides a view to the user.

COLLADA focuses on the domain of interactive applications in the entertainment industry, where the content is three-dimensional and is a game or related interactive application. Therefore, the user of the application will be referred to as the player.

The types of information that can be provided to the player depend on the output devices available. Most games use one or several screens to display the visual information, sometimes a system with stereo visualization, and a set of speakers for the audio information. Often, some physical sensation can be rendered, as simple as a vibrating device embedded in the joystick, or as sophisticated as a moving cabin in arcade settings. The application may output, or render, several sensors at the same time, often in different places. The term observer defines a group of sensors that move together. For example, an observer in a (virtual) car may have at least two visual sensors to represent the out-of-the-window view (in this case, the view through the windshield) and the rearview-mirror view. Several observers, which may be simultaneous users, can be displayed together by the same application, for example, in split-screen games where two or more players are sharing the same screen.

The content must include all data required by all the sensors that the application wants to use. The content can have multiple representations stored, partially sharing some of the elements. For example, if the content represents a landscape, different materials may be needed to represent the four seasons. The different representations of the data are called scenes.

Another type of interactive application is a training simulator in which the goal is to teach the user how to behave in real situations. These applications are not games, and the trainees’ reactions when they make a deadly mistake in a simulation make this quite clear. COLLADA does not focus on simulation applications, but it could certainly be used in this domain as well [1].

The computer-generated animation movie industry is also very interested in COLLADA. This application is completely scripted, not interactive. That industry’s goal is to be able to produce previsualization of scenes that look as final as possible, in a very small amount of time, which is why they are interested in integrating game technology in their production process.

Other types of applications may also profit from COLLADA, but its goal is to concentrate on interactive game applications, not to expand the problem domain. Other applications that require the same type of technologies will indirectly benefit from it.

Separation between Content and Runtime

The first interactive real-time applications rendering three-dimensional graphics required very expensive dedicated hardware and were used mainly in training simulation. Physical separation between content and runtime did not exist in the early applications (such as in the GE Apollo lunar landing trainer) [2]. The content was embedded in the code, or more specifically, some subroutine was coded to render a specific part of the content. Eventually, effort was made to store the embedded content as data arrays, and the code became increasingly generic so it could render all kinds of data.

The next logical step was to separate the data physically from the code. This allowed creating several products with the same application, but with different data. More products were defined by the data itself, and content creation soon became a completely separate task.

The real-time application was then referred to as the runtime, and the content for the runtime was stored in the runtime database. In the game industry, the runtime is called the game engine.

Digital content creation (DCC) tools were created, but the data structures and algorithms used for modeling did not match the data that could be processed in real time by the application. DCC tools were also used by the movie industry for the production of computer-generated movies in which an advanced rendering engine was attached to the tool to produce a set of still images that compose the frames of the movie.

DCC tools and advanced rendering techniques, such as ray tracing [3] or shader languages such as RenderMan [4], required more advanced concepts than a real-time application could handle. Mathematical descriptions of surfaces such as splines and Bézier surfaces became necessary in the computer-aided design (CAD) market [5].

Interactive applications needed both the advanced modeling techniques and the simpler representation usable in real time. Because of this, compilation techniques, used to create binary executable code from high-level languages, were adapted for the content processing. The database used in the DCC tool was therefore called the source. The data compiler takes the source data and creates the runtime data.

Figure 1.1. Content pipeline synopsis.

The runtime database soon became too large to fit in the memory of the target system. The content had to be sliced and paged in real time, depending on where the observers were. This quite challenging problem sometimes required specific hardware assistance and necessitated a very specific encoding of the content. Specific algorithms had to be developed for terrain paging [6] and texture paging [7].

Because of economic constraints, severe limitations exist in hardware targeted by the game industry. The data must be organized as efficiently as possible. For performance optimization, many game applications use their own file system and combine the various elements in a single file. Some developers run optimization programs for hours to find the best placement of the elements for better interactivity. The idea is to minimize seek time and place data accordingly. This is very similar to the optimization stage of a compiler, with a complexity similar to that of the NP-complete traveling salesman optimization problem [8].
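As a rough illustration of the placement problem, the sketch below orders assets in a packed file with a greedy nearest-neighbor heuristic, a common cheap approximation for salesman-style problems. It is not taken from any particular game's tooling: the function name `greedy_layout`, the asset names, and the `transitions` table (how often one asset is read right after another) are all assumptions made for the example.

```python
def greedy_layout(assets, transitions, start):
    """Order assets so each one is followed by the unplaced asset most
    often read next to it.

    transitions[(a, b)] is a hypothetical count of how often asset b was
    read right after asset a during a profiling run.
    """
    order = [start]
    remaining = [a for a in assets if a != start]
    while remaining:
        last = order[-1]
        # Pick the unplaced asset with the strongest link to the last one.
        nxt = max(
            remaining,
            key=lambda a: transitions.get((last, a), 0) + transitions.get((a, last), 0),
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Made-up profiling data for four assets.
assets = ["level", "hero", "music", "enemy"]
transitions = {
    ("level", "hero"): 9,
    ("hero", "enemy"): 7,
    ("level", "music"): 2,
    ("enemy", "music"): 1,
}
layout = greedy_layout(assets, transitions, start="level")
```

A greedy pass like this gives no optimality guarantee, which is one reason an exhaustive search for the best placement can run for hours.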

Another example of paging technology used outside the game industry is the Google Earth application [9]. This application enables the user to look down on the planet from any altitude and render a view based on satellite altimetry and imagery information. It is the result of generalizing the terrain- and image-paging technology developed for high-end simulation applications [10].

Such applications handle only static data and have limited interactivity for the user to move around in the environment. However, most applications, especially in the entertainment industry, require the content to be highly dynamic. For example, animated objects are needed, either objects moving independently in a scene (moving object) or prescripted animations controlled by various parameters, such as a windsock depending on the direction and speed of the wind, or a feature like a bridge that can have different representations depending on whether or not it has been destroyed.

The objects in the runtime database are often organized in a graph, where each branch represents the relative position of the objects or different choices of multiple representations. This data organization is commonly referred to as the scene graph, and several runtime technologies have been developed to exploit this [11].

Some early rendering techniques used a particular organization of the scene graph to determine how the objects hide each other in the view, such as the BSP method [12]. Early hardware accelerator performance was greatly affected by the objects sent outside of the camera’s field of view, so scene graphs were organized with a container-type relation. All the “children” bounding volumes are enclosed in the “parent” bounding volume; therefore a culling operation can cut entire branches if the parent is not in the field of view.
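The container-type relation can be sketched in a few lines. The example below is a minimal illustration, assuming bounding spheres and a field of view described by planes whose unit normals point toward the visible side. The `Node` class and function names are invented for this sketch and do not come from any specific scene-graph library.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Scene-graph node whose bounding sphere encloses all of its children."""
    name: str
    center: tuple  # (x, y, z)
    radius: float
    children: list = field(default_factory=list)


def sphere_outside_plane(center, radius, plane):
    """plane is (nx, ny, nz, d); the visible half-space is n.p + d >= 0."""
    nx, ny, nz, d = plane
    x, y, z = center
    return nx * x + ny * y + nz * z + d < -radius


def collect_visible(node, planes, out=None):
    """Return the names of leaf nodes that are not culled."""
    if out is None:
        out = []
    # If the parent volume is entirely outside one plane, the whole
    # branch is skipped without testing any of its children.
    if any(sphere_outside_plane(node.center, node.radius, p) for p in planes):
        return out
    if not node.children:
        out.append(node.name)
    for child in node.children:
        collect_visible(child, planes, out)
    return out


left = Node("left_group", (-50.0, 0.0, 0.0), 10.0,
            [Node("house", (-50.0, 0.0, 0.0), 1.0)])
right = Node("right_group", (50.0, 0.0, 0.0), 10.0,
             [Node("tree", (48.0, 0.0, 0.0), 1.0),
              Node("rock", (52.0, 0.0, 0.0), 1.0)])
root = Node("world", (0.0, 0.0, 0.0), 100.0, [left, right])

# A single plane keeping only the half-space x >= 0.
visible = collect_visible(root, [(1.0, 0.0, 0.0, 0.0)])
```

Here the whole `left_group` branch is rejected with a single test, so `house` is never examined.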

More complex dynamic behavior is now required for interactive applications. The content needs to evolve following direct interaction with the user. This can be physical interaction with the virtual representation of the user in the application, or it can be indirectly linked to user interaction, such as when a pile of boxes collapses if one of the boxes is removed. Very complex control systems are developed to combine scripted animation, artificial intelligence (AI), physical simulation, and user control.

In other words, the content must be designed for interactivity, not only for the interaction with the user but also for the interactivity between the different elements of the content. The relationship between objects is much more complex, and the scene-graph technology is reaching its limit in capacity, because the graph becomes too complex and overloaded with many interdependent relationships. The dynamic nature of the entertainment application is such that the scene-graph technology is not sufficient to manage all of the content. Instead, hybrid techniques are developed, often targeting a specific game genre. Fortunately, modern rendering hardware performance is less impacted by sending objects outside the field of view, so the main rendering optimization that the scene-graph technology required is no longer an issue.

The process of creating the runtime database from the source database has become more complex and resource-intensive over time. The simple link between the content creation and the runtime is now an entity of its own, called the content pipeline. With the need for larger and more interactive content, game developers have to spend substantial resources on this technology.

The Content Pipeline

The content pipeline is composed of the following elements:

digital content creation (DCC) tools used by artists to create the source data;
the exporter, a program written for a given DCC tool that permits the content to be extracted from the DCC tool;
the conditioning pipeline, a set of programs that apply several transformations to the content, such as geometry cleaning and optimizing for fast rendering;
the runtime database, specifically encoded for a given runtime and often for a given target platform.

Figure 1.2. The content pipeline.
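The stages listed above can be pictured as a chain of functions. The following sketch is only an illustration under simplifying assumptions: the "DCC scene" is a plain dictionary, the conditioning steps are two simple geometry clean-ups, and the binary layout written by `encode_runtime` is made up for the example. None of the names correspond to a real tool or SDK.

```python
import struct


def export_from_dcc(scene):
    """Stand-in exporter: copy the data out into an intermediate form."""
    return {"vertices": list(scene["vertices"]),
            "triangles": list(scene["triangles"])}


def remove_degenerate_triangles(mesh):
    """Conditioning step: drop triangles that reuse a vertex index."""
    tris = [t for t in mesh["triangles"] if len(set(t)) == 3]
    return {**mesh, "triangles": tris}


def drop_unused_vertices(mesh):
    """Conditioning step: keep only referenced vertices and remap indices."""
    used = sorted({i for t in mesh["triangles"] for i in t})
    remap = {old: new for new, old in enumerate(used)}
    return {
        "vertices": [mesh["vertices"][i] for i in used],
        "triangles": [tuple(remap[i] for i in t) for t in mesh["triangles"]],
    }


def encode_runtime(mesh):
    """Pack into a hypothetical little-endian layout:
    vertex count, triangle count, float coordinates, 16-bit indices."""
    flat_v = [c for v in mesh["vertices"] for c in v]
    flat_i = [i for t in mesh["triangles"] for i in t]
    fmt = f"<II{len(flat_v)}f{len(flat_i)}H"
    return struct.pack(fmt, len(mesh["vertices"]), len(mesh["triangles"]),
                       *flat_v, *flat_i)


def run_pipeline(scene, steps):
    data = export_from_dcc(scene)      # exporter
    for step in steps:                 # conditioning pipeline
        data = step(data)
    return encode_runtime(data)        # runtime database


scene = {
    "vertices": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)],
    "triangles": [(0, 1, 2), (1, 1, 3)],  # the second triangle is degenerate
}
blob = run_pipeline(scene, [remove_degenerate_triangles, drop_unused_vertices])
```

Keeping each conditioning step as an independent function over the same intermediate data is what makes it possible to add, remove, or reorder steps without touching the exporter.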

A novice may ask why it is necessary to write an exporter, since the DCC tool already saves the data in the source database, or why an importer couldn’t be created as part of the conditioning pipeline tools. There are, in fact, many difficulties that make these solutions impractical.

The main problem is that the format used by the DCC tool is often proprietary, so it is not possible to write an importer. Even when the format is available, it may be very complex to read and require knowledge of the DCC tool algorithms that are not available to users.

In practice, the exporter is an integral part of the pipeline, since it is already doing some processing on the source data, utilizing the DCC built-in functions to convert internal representation to a more usable format. From the point of view of compiler technology, the exporter is actually the front end of the data compiler, and the data produced by the exporter is the intermediate format.

To enable users to extract the data, DCC tools typically offer a software development kit (SDK) that provides an application programming interface (API) to interact with the internal representations. Some DCC tools provide different APIs, depending on the type of data most often needed by the application. For instance, a game SDK is sometimes provided specifically to help game developers.

DCC tool vendors prefer providing and supporting SDKs rather than publishing the details of their internals and supporting developers tinkering with reading their proprietary files directly. The main reason for DCC tools to do this is that they need to be able to provide new releases. If there were applications depending on a given internal structure, this would make major improvements very difficult. It is thus much easier to hide the internals and provide stability at the SDK level.

Therefore, application developers are forced to create exporters. Even with the relative stability of the SDK, developers must still continuously update their exporters if they want to keep up with new releases of tools. Experience shows that developers often decide to stick to one specific release of a particular DCC tool for a given production, since the cost and risk associated with constantly updating the content pipeline are too high. This also affects the development of advanced technology by DCC vendors for game developers who are in the middle of a game-development cycle.

Interestingly, there is also a business advantage for DCC vendors, since this locks developers into using a given vendor and restricts the window of time in which competitors have a chance at acquiring a new customer. Retooling often happens only when new hardware is introduced, whether it is a new-generation console or a new type of hardware such as a mobile phone. Even then, switching from one DCC tool to another is problematic because of the artists’ familiarity with a given DCC tool interface.

Exporter development is not an easy task, but it is forced upon developers. All game developers agree that creating and maintaining an exporter is very time consuming. Nevertheless, they also understand that the content pipeline is a piece of technology they have to master and cannot depend on pieces that are not perfect.

Not only is the resulting game content limited by the game developer’s capacity in developing a good content pipeline, but more often, better tools or technologies cannot be introduced because of the lack of flexibility in the design of content pipelines.

COLLADA was created to address this worrisome situation.

Problem Description

To export the data, developers have to design a format in which to export it. This is no simple task, especially because the format must be flexible enough to withstand changes in requirements during the development process, such as introducing new data types.

Once the data is exported, developers still have to write an importer for this data into their content processing facility. Some developers try to solve the problem by having the entire content pipeline contained in the exporter code, so that the output of the export process is directly in the format needed by the runtime. This approach is often problematic since the plug-in interface is not designed to handle such complex applications. It also causes major maintenance problems as the DCC application SDK evolves, and it locks the pipeline into a given DCC tool and often into a specific version of that tool.

Another approach developers take is to limit the input data to the simplest data, such as geometry, texture mapping, images, and animation curves, and to create a set of integrated tools to create the remaining data, often including the game engine as the means to visualize the data. This proves to be a quite successful approach for developers whose goal is to concentrate on a specific subset of game applications and who can create their content with a relatively small team. Such a tool, if well designed and implemented, provides artists and game designers with a very short feedback loop between content creation and its visualization in the runtime. On the other hand, this approach has several limitations. For example, the edits made in the tool cannot be pushed up the pipeline; therefore, if the input data needs to be modified, all the edits have to be done again. Another drawback is that it is impossible to use external technologies without integrating those directly into the tool itself.

These approaches in fact work against improving the situation. What is needed is an opening up of the tool pipeline, so that developers can use a variety of independent tools, introduce new technologies more easily, and adapt the content pipeline for use by larger teams and for all genres of games.

The Zoo

Content pipelines often use more than one intermediate format, since each tool may have its own export format or may not provide an SDK or plugin facilities for developers to use their own format. In other words, this is a zoo!

The data is exported from the modeler in a given format. An independent tool is used to adjust some properties that the DCC tool does not understand, then another tool is used to gather several assets into a game level, which is then sent to the final optimizer that will create the game-specific runtime format. Each of those tools may very well use independent and incompatible formats since it is difficult to create one format for all. It is so much more convenient for individuals to create their own tools rather than having to collaborate and spend valuable time on making compromises.

Figure 1.3. A typical zoo.

This process has several limitations.

The content pipeline is one-way. Therefore, changes cannot be saved back into the source database. When changes need to be made in the source, the data must go through the entire processing of the content pipeline before they can be seen in the level-editor tool. This process takes time, thus impacting productivity.
The tools are not interchangeable, which often creates situations where data have to be propagated through “back doors” when a change occurs in the process.
There is no easy way to create shortcuts in the process to enhance productivity.

Artists need to see their work running in the runtime. Ideally, they need to be able to do so without having to go through the entire content pipeline process. Therefore, a separate path, called the fast-path, needs to be created in parallel. To maintain the highest possible performance, only the data that is modified will go through the fast-path; the rest of the data will be loaded in the optimized runtime format. During production, it is necessary for the game engine to be capable of loading both the optimized data and the intermediate format.
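One way to picture the fast-path is as a per-asset decision made at load time. The sketch below assumes the engine can compare when an intermediate file was last edited with when its optimized runtime version was last built. The function name `plan_loads` and the timestamp dictionaries are hypothetical and stand in for whatever bookkeeping a real engine would use.

```python
def plan_loads(assets, intermediate_edit_times, runtime_build_times):
    """Decide, per asset, which representation the engine should load.

    An asset goes through the fast-path when it has never been built into
    the optimized runtime format, or when its intermediate file was edited
    after the last optimized build. Everything else loads the optimized data.
    """
    plan = {}
    for name in assets:
        edited = intermediate_edit_times.get(name)
        built = runtime_build_times.get(name)
        if built is None or (edited is not None and edited > built):
            plan[name] = "fast-path"
        else:
            plan[name] = "optimized"
    return plan


# Made-up timestamps: only the hero model was edited after the last build,
# and the new prop has never been built at all.
plan = plan_loads(
    ["terrain", "hero", "new_prop"],
    intermediate_edit_times={"terrain": 100, "hero": 250, "new_prop": 260},
    runtime_build_times={"terrain": 200, "hero": 200},
)
```

The cost of this arrangement is the one noted in the text: during production the engine has to carry loaders for both formats.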

A Common Intermediate Format

Everything would be much simpler if all the tools in the content pipeline could export and import a well-defined common format. Developers would not need to write and maintain their own exporters, and the data would be available directly to the content pipeline.

COLLADA’s goal is to foster the development of a more advanced content pipeline by standardizing a common intermediate representation, encouraging better quality content, and bringing many advanced features to the standard. COLLADA should be the transport mechanism between the various tools in the content pipeline.

But is it possible to establish such a common format?

DCC tools have been developed independently and may have very different ways of describing the data. Some tools have very specific attributes that some developers want to use in their pipeline but which would be impossible to export from other DCC tools. Defining a set of common representations that can be exported by all the tools and making sure that the tool-specific parameters can still be represented is a hard task.

The DCC Vendors

In the entertainment industry, there are three major DCC tools:

3ds Max;
Maya;
XSI.

All vendors understand the value of a common format, but for a different reason. They are seeking not only an intermediate format but also an interchange format.

The goal of an interchange format is to enable the data to move freely from one tool to another. The main idea is to enable several DCC tools to be used by developers in the same production or to enable developers to switch easily from one main DCC vendor to another DCC vendor. Of course, a DCC vendor that has a much larger market share than the others may not be interested in risking it and may not want a common interchange format to exist, or at least not one that is not under its control.

Recently, Autodesk, which owned 3ds Max, acquired Maya. This consolidation of the market creates a difficult situation for game developers since Autodesk may use its strong position to reduce the interchangeability of data with external tools. At the same time, the need for interoperability between Maya and 3ds Max is growing, unless one of the tools is to be scavenged and merged into the other one. This recent move makes a common interchange format both more important and more difficult to establish.

Most of the time, developers use a single DCC tool in their production pipeline. Another product may have a set of functionalities that would improve developer productivity, but the only way for the new tool to be usable would be to have it inserted in the current content pipeline. The new tool has to be able to interchange data with the primary tool.

Softimage had long ago developed the dotXSI format and SDK to enable interchangeability between tools. They have published the format publicly and have created exporters and importers for the other DCC tools available in open source [13].

The problem is that even if freely available, dotXSI is designed and owned by one of the DCC vendors and therefore competing vendors cannot be expected to fully support it.

From the user’s point of view, the main drawback of a single company owning the intermediate format is that the company is then in a position to define exactly which features are represented in the format. This makes it impossible for the other vendors to innovate, since any advanced features they might add would not be used by developers until they became available in the interchange format.

Smaller tool companies have the same problem and have to interface with all of the major tools. Most partner with the DCC vendors and create importers and exporters or embed their application as a plug-in.

A Collaboration Model

Other industries have experienced a similar problem. The COLLADA design methodology has been modeled on the experience of the 3D graphics accelerator industry. There used to be many 3D APIs, such as PHIGS, GKS-3D, Doré, HOOPS, Glide, and IrisGL, promoted by various vendors, but the industry really took off only when standard APIs were adopted. The two current mainstream APIs, OpenGL and Direct3D, were created and established as standards in 1992 and 1995, respectively [14].

The OpenGL proposition was to create a committee, the OpenGL Architecture Review Board (ARB), bringing together all the providers who would collaborate to design a common API. This approach proved successful despite serious competition among the vendors.

Direct3D was created by Microsoft and became a de facto standard in the PC market simply because of the large share of the market that Microsoft operating systems have.

One major difference between OpenGL and Direct3D is the opportunity for hardware vendors to provide exclusive extensions to the API, enabling them to innovate and better serve their specific customer needs. This advantage can also be a disadvantage because if too many vendor-specific extensions are created, each vendor ends up with a different API, thus weakening the standard. Therefore, there is a significant interest for the companies belonging to the OpenGL ARB to compromise and offer a common API.

Since the lack of a standard intermediate format is hurting interactive industry developers, it ultimately hurts the major platform vendors in this industry, in particular, the game-console vendors.

Each generation of game consoles exponentially adds to the demand for interactive content, in quantity as well as in quality. The cost and quality of content production are directly proportional to the quality of the content pipeline, and especially the quality of the exporters.

The authors of this book, both working in the US R&D department of Sony Computer Entertainment (SCE), started the project of a standard intermediate format that would be designed in partnership with the major tool vendors. During SIGGRAPH ’03, the first meetings took place, and thanks to the dominant position of SCE in the entertainment industry, the three main DCC vendors agreed to participate in this project.

This project became known as COLLADA, an acronym for COLLAborative Design Activity [15].

Intermediate or Interchange Format?

The design goal for COLLADA has been to enable not only exporting data from the DCC tools but also reimporting it. The main reason for this is to enable a large variety of tools to interact with the primary DCC tool chosen by the game developer, using COLLADA as a conduit. This answers the growing need for developers to be able to use (or create) utilities that will fit in their content pipeline without having to learn how to embed their utility inside the main DCC tool. It also enables the middleware industry to create external utility tools with specific tasks, such as triangulating, cleaning up the geometry, and remapping texture coordinates. Potentially, the open-source community could be involved in this, although the game community is currently not inclined to use or participate in open-source projects, mostly due to the risk of patent infringement.

In the model shown in Figure 1.4, the source database is stored in the DCC tool binary format, and COLLADA is used as the intermediate format between the various tools. The communication between tools can be direct, but most often it is done using a file system.

Since one of the external tools can be another DCC tool, can COLLADA be used as an interchange format? It certainly is an interchange format, but it is limited to the content that can be stored in COLLADA and the capability for the DCC tools to recognize this content.

COLLADA includes the notion of techniques and profiles (see “Techniques, Profiles, and Extras,” page 34). Each COLLADA-compatible tool has to correctly interpret the data in the “common” technique; this is mandatory to make sure COLLADA has a subset that is universally recognized.

The purpose of profiles is to enable tools or game developers to extend the format by creating a specific section in the content that is clearly marked by a label indicating the nonstandardized part of the content. Although the application-specific techniques cannot be interpreted by other tools, they are still written using the same syntax rules, so they can still be loaded by all the applications.

Ideally, a COLLADA document going through the process of importing and then exporting from a given tool should keep all the extra data. This rule is fundamental to enable developers to extend the format in the way they need without having to change the exporter and importer code for all the tools they are using [16].
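The round-trip rule can be illustrated with a small sketch. The XML fragment below is deliberately simplified and is not a schema-valid COLLADA document: the element names only loosely follow the COLLADA style, and the profile label `MY_GAME` and its `lens_flare` setting are invented for the example. The point is that a tool edits the common data it understands and writes everything else back unchanged.

```python
import xml.etree.ElementTree as ET

# Simplified, illustrative fragment (not schema-valid COLLADA).
DOC = """
<light id="sun">
  <technique_common>
    <intensity>1.0</intensity>
  </technique_common>
  <extra>
    <technique profile="MY_GAME">
      <lens_flare>true</lens_flare>
    </technique>
  </extra>
</light>
"""


def scale_common_intensity(xml_text, factor):
    """Import, edit only the common data, and export again.

    The profile-specific section is never interpreted; it simply stays in
    the element tree and is written back with the rest of the document.
    """
    root = ET.fromstring(xml_text.strip())
    node = root.find("technique_common/intensity")
    node.text = str(float(node.text) * factor)
    return ET.tostring(root, encoding="unicode")


round_tripped = ET.fromstring(scale_common_intensity(DOC, 2.0))
extra = round_tripped.find("extra/technique[@profile='MY_GAME']/lens_flare")
```

After the round trip, `extra` still refers to the game-specific element even though the tool has no idea what a lens flare setting means. A tool that rebuilt the document from its own internal model, rather than preserving unknown elements, would silently drop it.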

The goal for an interchange format is to be able to transport all the data from one DCC tool to another without any loss and then to reverse the operation. This is not the primary design goal for COLLADA. This would be equivalent to having a compiler be able to recreate the source from the intermediate representation.

On the other hand, it is possible that eventually some of the tools may not have their own format and may use COLLADA as their native format. In that case, COLLADA would become the source format. Although this is not the current goal for COLLADA, much attention has been put in the design to enable this possible usage.

Figure 1.4. Content pipeline with a DCC tool and external utilities.

This approach would have the benefit of solving the data archival issue. Currently the only way to archive the source assets of an application is to store all the tools, including the DCC tools, with the assets in order to be able to reuse the asset later. Even if the DCC tools can be retrieved, it is not guaranteed that they can be run again on future development systems, since they often rely on a specific version of an operating system and/or hardware configuration. In addition, it may be impossible to find a proper license for those applications in the future. Storing all the source data in an open standard such as COLLADA would significantly raise the possibility of using the assets in the future.

Another unexpected usage of COLLADA by some companies is to utilize this format, or a binary equivalent representation, directly as a runtime format. COLLADA is definitely not designed as a target format, but it turns out that it can be suitable for being used directly by the runtime. This is mostly due to the massive deployment of the XML technology on many devices, which makes it easier to interface directly with COLLADA documents. This does not mean that the content pipeline would be reduced to nothing; it just means that the conditioning of the data can be done directly in the COLLADA format.

COLLADA: A Short History

It is important for the graphics community to participate in the design of a standard intermediate format, in order to avoid the situation where one single vendor dictates what features should be included and then uses their position to eliminate competition.

Following SIGGRAPH ’03, Sony Computer Entertainment did a thorough analysis of all the existing formats and, in the process, involved several other companies in the game industry, notably Criterion Software, which was at the time the most successful independent middleware company with their product RenderWare [17].

Many other companies became involved in the project:

Vicarious Visions, a game developer that was also a middleware provider with the product Alchemy, which they acquired with the purchase of Intrinsic Graphics [18];
Emdigo, a start-up in the mobile space, using COLLADA for 3D graphics on cellular phones [19];
Novodex, a real-time game physics company [20];
Discreet, representing 3ds Max [21];
Alias, representing Maya [22];
Softimage, representing XSI [23].

An informal working group was established, and after a year of weekly meetings, the COLLADA 1.0 specification was produced. The goal was that every partner would be satisfied with the specification so that DCC vendors would create importer/exporter plug-ins for the format. SCE insisted that the specification and plug-ins source code would be made public. It was not an easy task to get agreement on even the most basic definitions. More than one company pushed their current format to be adopted instead of creating a new one, which was not possible because of the highly competitive industry, and because a completely open format was needed without any intellectual property (IP) issues.

The work paid off. At SIGGRAPH ’04, the first public presentation of COLLADA 1.0 was made in a sponsored Tech Talk [24]. Even though the presentation was about a public open source project, SIGGRAPH organizers looked at the COLLADA presentation as a commercial project and requested sponsorship for it.

The presentation included numerous demonstrations of COLLADA content exported by one DCC vendor and then loaded back into another, modified and saved back, and loaded again into another DCC tool or into a middleware content pipeline. COLLADA content was also demonstrated running simultaneously on Xbox® and PlayStation®2 on the same screen with a video split-screen mechanism. In addition, a reduced version of the same content was created and demonstrated on a mobile phone.

This presentation surprised a lot of people and annoyed some. The audience was amazed at seeing the DCC vendors helping each other to make sure the data was correctly exchanged between their tools.

Why was SCE sponsoring an interchange format? The audience expected SCE to focus only on the PlayStation® platforms and avoid cross-platform capabilities by developing only proprietary technologies.

Many developers reported they needed a common intermediate format and thanked the partners for working on it, but it was so sudden that they said they would wait before embracing it, since it could go away as fast as it appeared!

Some complained that the COLLADA group was defining yet another format and should instead have embraced an existing one. The most upset crowd was the X3D community, which had worked to create a format for 3D content visualization for the Web [25] that used some of the same basic constructs as COLLADA. The main difference between the two formats stems from the fact that X3D is derived from the VRML scene graph concept for Web browsing of 3D assets, while COLLADA targets advanced game applications.

The COLLADA 1.0 specification concentrated on putting together the basic elements of the format. It was missing some fundamental features to be usable by game developers, such as animation. The main goal was to get the community excited about this project and gather feedback.

Overall, the presentation was quite successful, since it produced more involvement from the game and middleware community and gathered a large amount of feedback, which resulted in several revisions of the format (1.1, 1.2, 1.3, 1.3.1, and 1.4) [26].

More companies joined the COLLADA working group: ATI [27], NVIDIA [28], 3Dlabs [29], and Nokia [30].

The design committee's work progressed with the objective of bringing to COLLADA all the features required by the game industry.

A lot of work was needed to improve the quality of the plug-ins and refine the specification. At SIGGRAPH ’04, the only way a plug-in was tested was by using the content for the demonstration. It quickly became clear that this was not sufficient: the early adopters encountered so many problems that the technology was usable only for very basic applications.

SCE started to build a conformance test, a really complex task. Once an early version of the conformance test was available and early adopters had provided feedback, a bug database was set up. It revealed some fundamental issues with the plug-ins and, in some cases, aspects of the specification that were too loosely defined.

The speed at which bugs were fixed was slow because the amount of resources available at the DCC vendors for this project was limited and directly related to the number of customers requesting it—a typical “chicken and egg” problem. If the quality of the plug-ins was not improved significantly, developers would not use the technology; if developers were not using the technology, the plug-ins would not improve.

The situation progressed slowly, but in the right direction. One favorable event was SCE's announcement that COLLADA would be the official format for the PlayStation®3 SDK, which was a target for the DCC vendors [31]. Another big push came from the hardware vendors, who were suffering from the lack of a standard format. They had to develop numerous plug-ins and deal with the fact that each of their customers had a different format, which made support expensive and complex. In addition, hardware vendors were already accustomed to collaborating on standardization. With the addition of features required by game developers, such as animation and skin and bones, early adopters started to use COLLADA in their tool chain, putting more pressure on DCC vendors to do a good job on their exporters/importers.

Another lesson learned during this year was that, in order to be adopted, COLLADA would need to provide an API for developers to load, modify, and save COLLADA content easily. This was something we wanted to avoid, first because it is more work, but also because there are several commercial tools already available for this task that could be adapted for COLLADA. However, many developers were waiting for an official API to access COLLADA content from their application.

SCE then decided to start working on the COLLADA DOM, source code providing a C++ object model that reflects the COLLADA elements in the application memory. The first version was subcontracted to Emdigo and was limited to sample code that could load and save COLLADA documents but lacked the capability to edit the data in place. After delivery of this sample code, SCE dedicated resources to improving it and adding the missing features to make it really useful. The COLLADA DOM was finally released as open source in January 2006 (see “The COLLADA DOM,” page 167).
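
The load, edit-in-place, and save cycle mentioned above can be illustrated with a minimal sketch. It does not use the COLLADA DOM itself, which is a C++ library; it uses Python's standard xml.etree.ElementTree module on a tiny hand-written document. The sample document, the namespace string, and the rename_geometry helper are assumptions made for illustration and are not part of any COLLADA API.

```python
import xml.etree.ElementTree as ET

# Namespace used by COLLADA 1.4 documents (assumed here for illustration).
NS = "http://www.collada.org/2005/11/COLLADASchema"
ET.register_namespace("", NS)  # serialize with a default namespace, no prefix

# A tiny hand-written document, far smaller than anything a DCC tool exports.
SAMPLE = f"""<COLLADA xmlns="{NS}" version="1.4.1">
  <library_geometries>
    <geometry id="box" name="Box"/>
  </library_geometries>
</COLLADA>"""


def rename_geometry(root, geometry_id, new_name):
    """Hypothetical helper: change a geometry's name in place.

    Returns True if an element with the given id was found, else False.
    """
    for geom in root.iter(f"{{{NS}}}geometry"):
        if geom.get("id") == geometry_id:
            geom.set("name", new_name)
            return True
    return False


root = ET.fromstring(SAMPLE)                     # load into an in-memory tree
found = rename_geometry(root, "box", "Crate")    # edit the data in place
saved = ET.tostring(root, encoding="unicode")    # save back to text
reloaded = ET.fromstring(saved)                  # reload to check the round trip
```

A generic XML tree like this one only knows about elements and attributes. As described above, the COLLADA DOM instead provides an object model that reflects the COLLADA elements themselves.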

Another direction of improvement was to determine what the most important features were for the next-generation (PlayStation®3, Xbox® 360) content and to add those to the specification and the plug-ins. The decision was to focus on adding shader effects and physical properties to COLLADA.

At SIGGRAPH ’05, the first anniversary Tech Talk presentation was made. In addition, COLLADA presentations were made in September 2005 in Europe at Eurographics ’05 and in Japan at CEDEC ’05.

These presentations were a sneak preview of COLLADA 1.4 features, described in this book, which include shader effects (COLLADA FX) and real-time physics and collisions (COLLADA Physics). Once again, numerous demonstrations were made by many partners, showing more tools.

Softimage demonstrated XSI supporting both the shader effects and physics. They also demonstrated the new version of their viewer (the XSI viewer), a fast-path visualization tool capable of displaying all the new features simultaneously.
Alias demonstrated how the COLLADA external reference system can be used to improve productivity.
Discreet demonstrated work-in-progress of their shader and physics implementation.
Nokia and Emdigo demonstrated COLLADA content running on mobile phones.
NVIDIA demonstrated an alpha version of FX Composer 2.0, a shader-effect tool based on the COLLADA FX specification. The files created by FX Composer were then loaded back into DCC tools and assigned to several objects before being saved back and visualized with external viewers.
Feeling Software demonstrated Nima, a plug-in based on AGEIA’s PhysX that enables authoring and exporting of COLLADA Physics content inside Maya.

The technologies demonstrated were very advanced. Never before had there been a shader-effect format capable of using the Cg, HLSL, and GLSL languages. Never before had shader effects been exchanged back and forth between DCC tools and external tools. Never before had common physical parameters been exchanged between several tools and viewers. Although COLLADA 1.4 was to be delivered publicly months later, the presentations were very successful, and hundreds of developers were impatient to have access to this new release.

COLLADA: An Industry Open Standard

An important announcement at SIGGRAPH ’05 was that the Khronos Group had accepted COLLADA as an industry standard [32], along with OpenGL ES and several other real-time APIs. This was a very important step for COLLADA since its specification had been ratified by the Khronos Group promoters, a significant group of companies, and had finally reached an official industry-standard status.

COLLADA now has a life of its own (and will survive even if the original partners change their minds in the future), providing the stability that is necessary for the majority of developers, tool vendors, and middleware companies to invest in it. In addition, the Khronos Group provides the necessary IP protection, since ratification from the Khronos Group means that all the members have agreed that COLLADA did not infringe on any IP they owned, or, if that was the case, they agreed to provide an IP license to anyone using COLLADA.

Getting COLLADA accepted by the Khronos Group was no easy task. The major difficulty was that COLLADA was not an API, and the Khronos Group had only dealt with APIs in the past. Much convincing was necessary for the members to accept that common APIs were not enough and that they also had to make sure that content was available in a standard format.

On the other hand, the original partners were balancing the benefit of having COLLADA as a formalized open standard against the perception that SCE was abandoning the project to the Khronos Group. Because of this, SCE decided to become a Promoter member of Khronos, the highest rank of partnership. This was necessary for SCE to affirm its continued involvement in the project, which was key to its development.

Only two years after the project started, and one year after being publicly announced, COLLADA partners were successful in creating an industry standard targeted for the entertainment industry.

The authors want to thank all the partners and congratulate them for this exceptional result.

Notes and References

[1] The authors received an e-mail from a Lead Systems Engineer on a US Army simulator program mentioning their interest in using COLLADA as a database format.

[2] The Lunar Module Mission Simulator was used at the Kennedy Space Center between 1968 and 1972. It was used by every Apollo astronaut to train prior to their mission. Cameras controlled by a computer, filming a model of the lunar surface, projected the image in front of the four windows so the astronauts would feel as if they were actually maneuvering for a landing on the Moon. In this early real-time image generator, the database was in hardware!

[3] There are over 700 bibliographic references on ray-tracing techniques. Andrew S. Glassner’s book, An Introduction to Ray Tracing, first published in 1989 by Morgan Kaufmann, is a good reference book. POV-Ray (Persistence of Vision Raytracer) is a free ray-tracing tool available for many platforms and also in source code (http://www.povray.org/).

[4] Pixar Animation Studios created the RenderMan rendering technology to generate their own feature film productions. Since its introduction in the 1990s, it has become a standard tool in many computer graphics studios (http://renderman.pixar.com/).

[5] Computer-aided design (CAD) is a very large market. Originally created for the automobile and the aerospace industries, these DCC tools are now ubiquitous in the manufacturing industry. Pierre Bézier, a French mathematician who died November 25, 1999, was one of the early pioneers. While working at Renault, a French automaker, he invented a method of describing any 3rd degree curve using only four points, which is now referred to as the Bézier curve.

[6] General Electric was a pioneer in terrain paging capability for their IMAGE series. Terrain paging is a complex feature that requires specific hardware such as direct DMA engines from disk to main memory and fast disk array systems. A lot of information on terrain paging and related algorithms can be found on the World Wide Web (http://www.vterrain.org/).

[7] Christopher C. Tanner, Christopher J. Migdal, and Michael T. Jones. “The Clipmap: A Virtual Mipmap.” In Proceedings of SIGGRAPH 98, Computer Graphics Proceedings, Annual Conference Series, edited by Michael Cohen, pp. 151–158, Reading, MA: Addison Wesley, 1998.

[8] In complexity theory, the NP-complete problems are the most difficult problems in NP (nondeterministic polynomial time).

[9] Google Earth lets you browse the Earth from your computer, paging both terrain and satellite images in real time from the Internet. It is freely available (http://earth.google.com/).

[10] Christopher Tanner and Rémi Arnaud created a prototype of this technology as a demonstration when looking for venture money to finance the Intrinsic Graphics start-up. Once Intrinsic Graphics was financed, it focused on middleware for the game market. The original technology was so compelling that Keyhole was created as a separate entity to create a product. This company was later bought by Google, and the product became Google Earth.

[11] IRIS Performer, a very popular scene graph, was developed at Silicon Graphics and is still in use today. J. Rohlf and J. Helman. “IRIS Performer: A High Performance Multiprocessing Toolkit for Real-Time 3D Graphics.” In Proceedings of SIGGRAPH 94, Computer Graphics Proceedings, Annual Conference Series, edited by Andrew Glassner, pp. 381–395, New York: ACM Press, 1994. OpenSceneGraph was first created as an open-source equivalent, since Performer was originally not available on any platform other than SGI computers. Industry-standard database formats, such as Multigen OpenFlight, were created and are still in use today in the simulation industry.

[12] Binary space partition (BSP) trees were first used in the early flight simulators’ visual systems for hidden part removal. The idea is to determine the drawing order of all the geometry so that the hidden parts are covered by the visible parts. This back-to-front drawing approach, known as the painter’s algorithm, was used in early 3D games before the hardware-accelerated Z-buffer was widely available in graphics hardware.

[13] DotXSI and the FTK (File Transfer Toolkit) were created by Softimage (http://softimage.com/products/xsi/pipeline_tools/dot_xsi_format/).

[14] Historical information on the creation of OpenGL and DirectX can be found on Wikipedia (http://en.wikipedia.org/wiki/Opengl#History) (http://en.wikipedia.org/wiki/Direct_x#History).

[15] The name COLLADA was coined by the engineers in R&D, since several projects had code names that were named after winds. A collada is a strong north or northwest wind blowing in the upper part of the Gulf of California, but blowing from northeast in the lower part of the Gulf. The acronym COLLAborative Design Activity was then created by Attila Vass, senior manager in the SCE US R&D department.

[16] Unfortunately, this relies on the capability of the DCC tools to be flexible enough to store the extra data in their internal representation. A discussion on this project is available on the World Wide Web (https://collada.org/public_forum/viewtopic.php?t=312).

[17] Criterion has since been acquired by Electronic Arts. EA has made RenderWare their main tool. Although the product was still available as a middleware on the market, the loss of independence rapidly impacted their ability to sell. The other game developers could not afford to depend on their competitor’s technology.

[18] Vicarious Visions was later purchased by Activision, eliminating another independent middleware vendor.

[19] The Emdigo website (http://www.emdigo.com/).

[20] AGEIA acquired Novodex in 2004, which gave them the PhysX SDK. In September 2005, they acquired Meqon Research AB, consolidating the market for the physics engine for game development.

[21] Discreet and Autodesk offer a wide range of products for the media and entertainment market (http://www.discreet.com/) (http://www.autodesk.com/).

[22] Alias had been owned by Silicon Graphics, Inc. since 1995. It was acquired by Accel-KKR, an equity investment firm, for $57M in April 2004. It was then sold to Autodesk in January 2006 for $197M. Autodesk now has both 3ds Max and Maya DCC tools, representing close to 80% of the tools used in the game industry (http://www.alias.com/) (http://www.autodesk.com/).

[23] Softimage, an Avid company, is the maker of the XSI DCC tool (http://www.softimage.com/) (http://www.avid.com/).

[24] R. Arnaud, M. Barnes. “COLLADA: An Open Interchange File Format for the Interactive 3D Industry.” SIGGRAPH ’04 Exhibitor Tech Talk, 2004.

[25] X3D has also decided to use XML as the base technology (http://www.web3d.org/). Unfortunately, X3D design is not very popular among game developers, since it was designed for a different domain of application: 3D for the Web. Later on, the X3D community created a document that shows that the two designs are very different (http://realism.com/Web3D/Collada/Nodes).

[26] COLLADA 1.1 was published in December 2004. COLLADA 1.2 was published in January 2005 as a patch release. COLLADA 1.3 was released in March 2005, introducing only a few features such as skinning but improving the conformance test and the quality of plug-ins. There was a COLLADA 1.3.1 patch release in August 2005. The specification stayed quite stable between 1.1 and 1.3.1, waiting for DCC vendors to create quality tools and developers to start using the technology. All those releases were done under an SCE copyright and licensing. COLLADA 1.4 was released in January 2006 under the Khronos umbrella, introducing major features such as COLLADA FX, COLLADA Physics, and a new design philosophy based on strong typing. Once again, the specification is in a stable phase, waiting for good implementations to be available. With the additional interest from developers, the cycle will be much shorter this time.

[27] ATI website (http://www.ati.com/).

[28] NVIDIA website (http://www.nvidia.com/).

[29] 3Dlabs website (http://www.3dlabs.com/). In February 2006, 3Dlabs decided to drop their desktop division and concentrate on the mobile embedded market space.

[30] Nokia website (http://www.nokia.com/).

[31] At the 2005 PlayStation press conference, Masami Chatani (CTO) announced that the PS3’s development environment will support COLLADA. This raised the interest in COLLADA from game developers.

[32] Khronos Group. “COLLADA Approved by the Khronos Group as Open Standard.” Press Release, July 29, 2005 (http://www.scei.co.jp/corporate/release/pdf/050729e.pdf).

About the Author(s)

Mark Barnes

Blogger

Mark Barnes joined Sony Computer Entertainment US R&D in July 2003 as a member of the graphics team where he is leading the effort on COLLADA. Barnes’ experience and knowledge in the field of visual simulation include database tools, distributed processing, and real-time graphics.
