GDC 2002: Incredibly Dense Meshes
Trends in the advancement of game hardware show that processing speed increases faster than RAM size and DMA bandwidth. Today, especially on PlayStation2, we see that math calculations and triangle rasterization are no longer the bottlenecks: it is availability of RAM and DMA transfer speed that limit our engines. Future systems are likely to have transformation and rasterization so fast that RAM and DMA limits are even more obvious. This talk presents a technique for dealing with "incredibly dense meshes" where one might imagine it to be nearly impossible to move the data around in real time. The technique borrows from many well-known disciplines, including wavelets, subdivision surfaces, and height fields. The result is a method for authoring, storing, and rendering dense meshes that may have one million triangles or more, assuming very fast transformation hardware and comparatively small RAM and slow DMA.
This paper proposes a general solution for a problem that will probably arise in the near future, perhaps with the launch of PlayStation 3 and its counterparts. Presupposed is the exacerbation of a current trend in console hardware, where hardware transformation, lighting, and triangle drawing outpace the hardware's capacity to store and move data. In other words, in the future, transforming vertices and drawing triangles will be virtually free in comparison with storing model data and reading it from main RAM.
For brevity the paper focuses on a solution for character models. On the one hand, character models pose more problems than fixed environment geometry in regard to controlling the animation of vertices with bones. On the other hand, culling and view-dependent level-of-detail need not be considered.
The summary of the problem we propose to solve is as follows: render a very dense animated skinned character mesh in "the future," when transforming vertices and drawing triangles will be cheap, and when loading from RAM will be expensive. (For programmers of the PS2, this "future" will sound more like the present!)
The contribution of this paper is to collect several already-published ideas and propose their application to the problem at hand. Some details are given here, but the reader is encouraged to consult the academic papers in the References for more detail.
Later sections discuss the details of the problem, outline several techniques whose integration yields a potential solution, and briefly discuss a possible future without textures.
The Future of Character Models
With the onset of the next "new thing" in game hardware, consumer expectation could very well be an amount of detail equivalent to one triangle per pixel. If screen resolution continues to double with each generation, 1024x768 will be the typical screen size. If, when close to the view, a character model occupies roughly 50% of the screen pixels, and given that the model has half its triangles facing away from the view, then the total triangle count will be nearly 800,000 triangles per character model.
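The density estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch; the function name and its parameters are hypothetical and are not part of any engine described in this paper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: estimates the triangle count of a character model
   that shows one front-facing triangle per covered pixel. */
uint32_t dense_mesh_triangles(uint32_t width, uint32_t height,
                              uint32_t coverage_percent)
{
    assert(coverage_percent <= 100);
    /* Pixels covered by the character when it is close to the view. */
    uint64_t covered_pixels = (uint64_t)width * height * coverage_percent / 100;
    /* Half of the triangles face away from the view, so the full mesh
       holds twice as many triangles as there are covered pixels. */
    return (uint32_t)(covered_pixels * 2);
}
```

For a 1024x768 screen with 50% coverage this gives 786,432 triangles, which is the "nearly 800,000" figure used in the rest of the section.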
Will transformation and drawing performance really be up to this task? To answer that question, let's look at the advance we saw from PS1 to PS2. The best PS1 character engines seen by the author could render about 20 characters of 400 triangles each at 30Hz, with only single-bone control and no dynamic lighting. The author has seen character engines on PS2 that render 31 characters of 3400 triangles each at 60Hz, with up to 3 bones per vertex and with dynamic lighting on all vertices. Even ignoring the added bone control and the dynamic lighting, that is an increase from about 240,000 to about 6.3 million triangles per second, a factor of more than 25.
So let's assume that transformation and drawing performance will increase again by a factor of 25. In terms of total capacity for a character renderer, PS3 will then have the ability to transform, light, and draw 31*3400*25 > 2.6 million triangles per frame at 60Hz.
If we have a system that incorporates LOD control, if we assume the 800,000-triangle density above, and if on average each character occupies only 25% of its near-view screen space, then we would have to render an average of 200,000 triangles per character. This means that in games like team sports, where most of the rendering horsepower is devoted to character rendering, it is perfectly reasonable to expect 10 characters in a scene where, at maximum detail, each character is 800,000 triangles.
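The throughput comparison and the resulting character budget can be checked with the sketch below. The helper names are hypothetical; the numbers fed to them are the ones quoted in the text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: raw engine throughput in triangles per second. */
uint64_t triangles_per_second(uint32_t characters, uint32_t tris_each,
                              uint32_t hz)
{
    return (uint64_t)characters * tris_each * hz;
}

/* Hypothetical helper: number of characters that fit in a per-frame
   triangle budget when each one covers only screen_percent of its
   near-view screen space and LOD scales its triangle count to match. */
uint32_t characters_per_frame(uint64_t frame_budget, uint32_t max_tris,
                              uint32_t screen_percent)
{
    assert(screen_percent <= 100);
    uint64_t average_tris = (uint64_t)max_tris * screen_percent / 100;
    assert(average_tris > 0);
    return (uint32_t)(frame_budget / average_tris);
}
```

With these figures, `triangles_per_second(20, 400, 30)` is 240,000 and `triangles_per_second(31, 3400, 60)` is 6,324,000, a ratio of about 26. A projected budget of 31*3400*25 = 2,635,000 triangles per frame then covers 13 characters at an average of 200,000 triangles each, which leaves some headroom over the 10-character scene.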
Make the following assumptions about storage technique:
Indexed triangle strips. Each strip vertex has a 32-bit index and 32-bit texture coordinates (16 bits each for U and V).
Each vertex is 32-bit floating-point.
The mesh has a 2-to-1 triangle-to-vertex ratio.
Stripping performance is 8 strip vertices per strip.
With these assumptions, current techniques will result in the following storage needs for each character:
Data | Count | Size |
---|---|---|
Vertices | 400,000 | 1,600,000 bytes |
Strip Vertices | 900,000 (100,000 restarts) | 7,200,000 bytes |
So for the characters in our 10-character scene, we would need to store 83MB of data and mess with it each frame.
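The storage total can be reproduced from the table. The sketch below assumes the byte sizes implied by the table's own figures (4 bytes per vertex entry and 8 bytes per strip vertex); the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one character's mesh storage. */
typedef struct
{
    uint32_t vertex_count;           /* 400,000 in the table */
    uint32_t bytes_per_vertex;       /* 4, as implied by the table */
    uint32_t strip_vertex_count;     /* 900,000, restarts included */
    uint32_t bytes_per_strip_vertex; /* 8: 32-bit index + 32-bit UV */
} MESH_STORAGE;

/* Bytes needed to store one character. */
uint64_t mesh_storage_bytes(const MESH_STORAGE *m)
{
    assert(m != NULL);
    return (uint64_t)m->vertex_count * m->bytes_per_vertex
         + (uint64_t)m->strip_vertex_count * m->bytes_per_strip_vertex;
}

/* Bytes needed for a scene of identical characters. */
uint64_t scene_storage_bytes(const MESH_STORAGE *m, uint32_t characters)
{
    return mesh_storage_bytes(m) * characters;
}
```

One character comes to 8,800,000 bytes, and ten characters to 88,000,000 bytes, which is a little under 84MB when a megabyte is counted as 1,048,576 bytes.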
While predicting the future is dangerous, it is the author's opinion that future hardware will be able to easily transform and draw meshes of the above size but that it will not be able to store and access that much data. The predicted implications are that rendering techniques that store and move only small amounts of data will be necessary in order to meet consumer expectations of visual quality. Those rendering techniques will have to include smooth LOD control.
Loop Subdivision Surfaces
Loop Subdivision Surfaces provide a solution to the problems predicted above by having the following advantages:
The program need store only a coarse base mesh, from which an algorithm procedurally generates a smooth surface.
The vertices of the base mesh can be animated with bone control, still allowing the rendered mesh to be created procedurally.
Smooth LOD control is implicit in the method as a result of blending between 2 levels of subdivision.
Sharp edges can be included with continuous sharpness control.
Base meshes of any topology can be used.
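Two of these advantages can be illustrated with a short sketch. Each Loop subdivision step splits every triangle into four, so a small base mesh grows quickly; and a smooth LOD transition can be produced by interpolating each vertex between its position at two adjacent levels. The type `VEC3`, both function names, and the blend parameter `t` are assumptions made for illustration, and a plain linear interpolation is only one possible way to realize the blend.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 3D position type. */
typedef struct { float x, y, z; } VEC3;

/* Triangle count after a number of subdivision steps, given that each
   step replaces every triangle with four. */
uint64_t loop_triangle_count(uint32_t base_triangles, uint32_t levels)
{
    assert(levels <= 16);
    return (uint64_t)base_triangles << (2u * levels); /* base * 4^levels */
}

/* Linear blend of one vertex between its position at the coarser level
   (t = 0) and at the finer level (t = 1). */
VEC3 blend_levels(VEC3 coarse, VEC3 fine, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    VEC3 out;
    out.x = coarse.x + t * (fine.x - coarse.x);
    out.y = coarse.y + t * (fine.y - coarse.y);
    out.z = coarse.z + t * (fine.z - coarse.z);
    return out;
}
```

As an example of the storage saving, a base mesh of 3,125 triangles reaches 800,000 triangles after four subdivision steps, so only the small base mesh has to be stored and animated.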
The Basics
This section gives a very brief explanation of how Loop Subdivision works. For a full exposition of subdivision surfaces, see [7].
Begin with a base mesh. The base mesh data structure must have triangles, edges, and vertices. The vertices store the actual geometry and color content for the model. All features (vertices, edges, triangles) must store information on the connectivity to other features. The exact format of this connectivity depends on the rendering method. To help explain the Loop scheme below, we will use a naïve structure for vertices as follows. For simplicity the structure stores only geometric position and not color information.
typedef struct VERT
{
VERT* parents[2]; // Verts on birth edge.
VERT* across[2]; // Verts opposite birth edge.
ARRAY<VERT*> adjs; // Vertices sharing an edge.
VECTOR pos;