In Parts One and Two of this series, I explained how to dynamically generate and render planetary bodies at real-time speeds using a function based on fractal Brownian motion (fBm) paired with a spherical ROAM algorithm. This article will concentrate on how to scale that up to a star system or even an entire galaxy. It will also discuss some of the problems you will run into with scale and frame of reference, and different ways to solve them.
Problems of Scale: One Planet
The main problem with trying to model and render a really large game world is precision. A 32-bit float has roughly 6 to 7 significant decimal digits of precision, and a 64-bit double has roughly 15 to 16. To put this into the correct frame of reference, if the smallest unit you care about keeping track of is a meter, you start to lose accuracy around 1,000 km with floats and around 1 trillion km with doubles. If you need millimeter resolution, both limits shrink by a factor of 1,000, to about 1 km and 1 billion km.
Given the fact that the Earth's radius is close to 6,378 km, a 32-bit float isn't even enough to model and render one Earth-sized planet accurately. But losing precision at the millimeter, and possibly centimeter level, with the vertices in a planet's model is not a significant concern. You will run into a number of much bigger problems trying to model and render such a large game world. One possible solution is to use 64-bit doubles everywhere, but this is a slow and rather clumsy way to solve these problems.
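To see how coarse each type gets at a given magnitude, you can measure the gap between adjacent representable values. The following is a minimal sketch, assuming positions are stored in meters; Spacing() is a hypothetical helper, not part of the demo, and it relies only on std::nextafter from the standard library.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not part of the demo): returns the gap between x and
// the next representable value above it, which is the best resolution the
// type can offer at that magnitude.
template <typename T>
T Spacing(T x)
{
    assert(std::isfinite(x));
    return std::nextafter(x, std::numeric_limits<T>::infinity()) - x;
}

// Assuming 1 unit = 1 meter:
//   Spacing(6378000.0f) is 0.5     (float at Earth's radius: half a meter)
//   Spacing(1.496e11f)  is 16384   (float at Earth's orbit: about 16 km)
//   Spacing(1.496e11)   is 2^-15   (double at Earth's orbit: about 0.03 mm)
```

The float results show why a single planet is already borderline and why a star system is out of reach with single precision alone.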
When I first rendered my planet centered at the origin of my 3D map, I noticed two problems right away. The first was that placing the far clipping plane out at a decent distance made my Z-buffer useless. The second problem was that at a certain distance, the planet would disappear regardless of what I set the far clipping plane to. The second problem seemed to be driver or card-specific because each video card I tested it on ran into the problem at different distances. Both problems had something to do with very large numbers being used in the transformation matrices.
I solved both of these problems by scaling down the size and the distance of planetary bodies by modifying the model matrix. Using a defined constant for the desired far clipping plane, which I'll call FCP for now, I exponentially scale down the distance so that everything past FCP/2 (out to infinity) is scaled down to fall between FCP/2 and FCP. To make the size of the planetary body appear accurate, all you have to do is scale the size by the same factor you scale the distance. Once the routine was written, I just brought the far clipping plane in until the Z-buffer precision seemed to be sufficient. Because distances are scaled exponentially, the proper Z order is maintained in the Z-buffer.
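The scaling described above can be sketched roughly as follows. This is not the routine from the demo: ScaleDistance() and ScaleFactor() are hypothetical names, and the falloff length is an assumed tuning constant that controls how quickly far distances are compressed toward FCP.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: compresses a true distance into the range the depth
// buffer can handle. Distances up to fcp/2 are left alone; anything farther
// is mapped into the range between fcp/2 and fcp by an exponential falloff.
// The falloff length is an assumed tuning constant, not a value from the demo.
double ScaleDistance(double distance, double fcp, double falloff)
{
    assert(distance >= 0.0 && fcp > 0.0 && falloff > 0.0);
    const double half = 0.5 * fcp;
    if (distance <= half)
        return distance;
    // -expm1(-x) equals 1 - exp(-x); it rises monotonically from 0 toward 1,
    // so nearer objects still end up with smaller scaled distances.
    return half + half * -std::expm1(-(distance - half) / falloff);
}

// Factor to apply to both the object's distance and its size, so that its
// apparent (angular) size on screen is unchanged.
double ScaleFactor(double distance, double fcp, double falloff)
{
    if (distance <= 0.0)
        return 1.0;
    return ScaleDistance(distance, fcp, falloff) / distance;
}
```

One caveat with any mapping of this kind: objects many falloff lengths away are squeezed so close to FCP that their scaled distances become nearly identical, so the choice of falloff length matters.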
Problems of Scale: A Star System
Next I tried placing a star at the center of the 3D map and placing the planet and camera out at Earth's orbital distance in the X direction. I immediately ran into both rendering problems and positioning problems, though it was hard to tell them apart until the rendering problems were fixed. The rendering problems caused all objects in the scene to shake and occasionally disappear whenever the camera moved or turned. Again the rendering problems showed up differently on each video card I tested, and again they had something to do with very large numbers being used in the transformation matrices.
Perhaps the most common way to use OpenGL's model/view matrix is to push the camera's view matrix onto the stack and multiply it by each object's model matrix during rendering. The problem with the traditional model and view matrices in the test case outlined above is that both have a very large translation in the X direction. A 32-bit float starts to lose precision around 1,000 km, and the radius of Earth's orbit is around 149,600,000 km. Even though the camera is close to the planet and the numbers should cancel each other out, too much precision is lost during the calculations to make the resulting matrix accurate.
Is it time to resort to doubles yet? Not yet. This problem can be fixed very easily without using doubles by changing how the model and view matrices are calculated. Start out by pretending the camera is at the origin when you calculate your view matrix. If you use an existing function like gluLookAt() to generate your view matrix, just pass it (0, 0, 0) for the eye position and adjust your center position. Then calculate each model matrix relative to the camera's actual position by subtracting the camera's position from the model's position. The result is two matrices with very small numbers when the camera is close to the model, which makes the problem go away completely. A precision problem still exists with objects at a great distance from the camera, but at that distance the precision loss isn't noticeable.
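As a rough illustration of the subtraction step, here is a minimal sketch. Vec3f and CameraRelative() are hypothetical names, not classes from the demo, and the units are assumed to be kilometers.

```cpp
// Hypothetical single-precision vector (the demo has its own vector class).
struct Vec3f { float x, y, z; };

// Returns the object's position relative to the camera. This is the
// translation to put in the model matrix when the view matrix has been built
// with the eye at (0, 0, 0). The large coordinates cancel here, before they
// ever reach a matrix multiplication.
Vec3f CameraRelative(const Vec3f &object, const Vec3f &camera)
{
    return Vec3f{ object.x - camera.x,
                  object.y - camera.y,
                  object.z - camera.z };
}
```

For a planet at x = 149,600,000 and a camera 8,000 km closer to the star, the result is simply (8000, 0, 0), and both matrices end up holding small numbers.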
After all rendering problems have been fixed, you run into precision problems with object positions. Using floats, you can't model positions accurately once you get out past 1,000 km from the origin. The most obvious symptom appears when you try to move the camera (or any other object) when it's far away from the origin. When a position contains really large numbers, a relatively small velocity will be completely dropped as a rounding error. Sometimes it will be dropped in 1 axis, sometimes in 2, and sometimes in all 3. When the velocity gets high enough along a specific axis, the position will start to "jump" in noticeably discrete amounts along that axis. The end result is that both the direction and magnitude of your velocity vector end up being ignored to a certain extent.
Is it time to resort to doubles yet? Yes. I don't think there's any way around it with object position. There's no number magic you can work that will give you extra digits of precision without cost. TANSTAAFL. Luckily, you only need doubles for object positions. Everything else can still be represented with floats, and almost every math operation you perform will still be a single-precision operation. The only time you need double-precision operations is when you're updating an object's position or comparing the positions of two objects. And with 15 digits of precision, you get better precision way out at 1000 times Pluto's orbit than you get with floats dealing with one planet at the origin.
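A minimal sketch of that split might look like the following. The Body struct and Advance() function are hypothetical, and the units are assumed to be meters and seconds.

```cpp
// Hypothetical object state: the position is stored in double precision,
// everything else in single precision.
struct Body
{
    double px, py, pz;   // position (assumed to be in meters)
    float  vx, vy, vz;   // velocity (meters per second)
};

// The position update is the only double-precision arithmetic here.
void Advance(Body &body, float dt)
{
    body.px += static_cast<double>(body.vx) * dt;
    body.py += static_cast<double>(body.vy) * dt;
    body.pz += static_cast<double>(body.vz) * dt;
}
```

At Earth's orbital distance, a one-meter step added to a float position is rounded away entirely, while the same step added to the double position above is kept.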
Problems of Scale: An Entire Galaxy
This is a tough one. A double may get you safely out to 1000 times Pluto's orbit, which is just under 2/3 of a light-year, but you really can't take it much farther. Since we don't currently have any built-in data types larger than a double, you have to resort to something custom. I've seen a number of implementations that will work here, but something fast is needed. I've seen custom 128-bit fixed-point numbers created using 2 __int64 values. I've seen 4-bit BCD (Binary Coded Decimal) routines used to achieve unlimited precision. I'm sure if you looked you could even find 128-bit floating-point emulation routines out there.
A common problem with all the schemes I've mentioned so far is performance. Generally speaking, software is much slower than hardware. This means that if you're not using a native data type, all of these custom routines will run much more slowly than double-precision operations. I prefer to solve this problem by using different frames of reference at different scales. The top level would be the galaxy level, with the galaxy centered at the origin and with 1 unit being equal to 1 light-year. The next level would be the star system level, with the star centered at the origin and 1 unit being equal to 1 kilometer.
Because the distance between stars is so vast, you really don't need to mix the two frames of reference. If you consider the fact that anything traveling between stars at sub-light speeds would never get there during the player's lifetime, then you can choose your frame of reference based on whether an object is traveling above or below the speed of light. When an object jumps to FTL (Faster Than Light) travel, you can immediately switch to the galaxy frame of reference. When an object drops back to sub-light speed, you can immediately switch to the star system frame of reference. If a star is nearby, you can choose that to be the new origin. Otherwise, you can make the object's initial position the origin. It is also possible to keep the player from stopping between star systems by forcing them to select a destination star system, then leading the camera there any way you want.
Problems of Scale: Z-Buffer Precision
I've already explained how to expand your Z-Buffer's precision by scaling planets down both by distance and by size. However, this won't solve all your Z-Buffer problems. You'll find that in scaling large numbers down, small changes in distance are lost. A moon that should be behind a planet may be rendered in front of it, or vice versa. Even when you're close to a planet, you can still have triangles break through the Z-Buffer because the far clipping plane is too far away. Bringing it in farther just makes those other problems worse.
Since you can't change the hardware in your video card, and since it's unlikely that chipset designers will provide a 64-bit Z-Buffer, this is a tough problem to solve. I have found that using impostor rendering makes this problem much easier to manage. To create an impostor, you render an object all by itself to the back buffer (or to a separate pixel buffer), then copy (or bind) that object into a texture map. Then the object is rendered as a single rectangle using the generated texture map. Because most objects look different from different viewpoints, the texture map must be updated as the camera moves around. In essence, an impostor is just a billboard that is generated in real-time as needed.
At this point many of you will be wondering how this helps you with Z-Buffer precision. The answer lies in the fact that you must use a special projection matrix when rendering your impostor texture, and this projection matrix has its own front and back clipping planes. This projection matrix creates a view frustum that fits tightly around your object on all 6 sides, which gives you the best Z-Buffer precision you can have for that object alone. Once that texture map is generated for a planet, you really don't need to worry about Z-Buffer precision. Because impostor texture maps are partially transparent, you need to render them in reverse Z order. This means that you can turn the Z-Buffer off completely when drawing the impostored planets.
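Drawing in reverse Z order just means sorting the impostors from farthest to nearest before drawing them. A minimal sketch, using a hypothetical ImpostorRef record rather than the demo's own classes:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical record; the demo keeps this information in its own classes.
struct ImpostorRef
{
    int   id;         // which planet or moon this is
    float distance;   // distance from the camera
};

// Sorts so that the farthest impostor comes first and the nearest comes last.
void SortBackToFront(std::vector<ImpostorRef> &impostors)
{
    std::sort(impostors.begin(), impostors.end(),
              [](const ImpostorRef &a, const ImpostorRef &b)
              { return a.distance > b.distance; });
}
```

With the list in this order, the billboards can be drawn with blending enabled and depth testing disabled (glDisable(GL_DEPTH_TEST)), so nearer planets simply paint over farther ones.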
Problems with Impostors
Impostors not only improve Z-Buffer precision for the objects being rendered as impostors, they also offer great performance improvements. Instead of rendering every planet in your scene every frame, you only need to render two triangles per planet on most frames. Every now and then you will have to update an impostor texture, but even then you will rarely need to update more than one planet's impostor in any given frame, and most frames won't need anything updated at all. Since you're not rendering all those triangles, you also don't need to check their ROAM priorities or update their triangle meshes.
Unfortunately, nothing comes without a price. There are a few problems that can crop up when using impostors. The first problem is that differences between the resolution of the impostor and the screen resolution can cause aliasing artifacts. This can be minimized by choosing an appropriate texture size based on the amount of view angle or screen space taken up by the object. Changing the texture resolution for an object will cause a visual shift that looks like the video card switching between mip-map levels. Because you can control the resolution/size trade-off, this problem is usually acceptable. However, this problem can become really bad when the camera gets so close that the object's edges extend beyond the edges of the screen.
Impostors can also cause problems related to the Z-Buffer. Because you're taking a 3D object and rendering it as a 2D rectangle, you are changing the Z-Buffer values that the object would normally generate. Worse yet, you are changing the Z-Buffer values for the entire rectangle, including the transparent portions of it. This can cause some really bad problems when an impostor gets too close to other objects. The rectangle can end up hiding objects that should be visible. It can even chop objects in half that lie in the rectangle's plane.
Luckily, these problems aren't too bad when dealing with planets. Inter-planetary distances are so large that we really only have to worry about these problems when the camera gets close to a planet. Even then we don't have to worry about objects being drawn on the surface of the planet, or in the planet's atmosphere, as these can all be rendered directly into the impostor if necessary. Still, at some point using the impostor will become more trouble than it's worth. The easiest way to deal with this is to switch from impostor rendering to normal rendering when the camera gets within a certain distance of the planet. In my demo I switch to normal rendering when a planet takes up 90 degrees or more of the field of view.
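The test for switching modes can be sketched like this. It assumes that "taking up 90 degrees" means the angle subtended by the planet's bounding sphere as seen from the camera; the function names are hypothetical.

```cpp
#include <cmath>

// Angle in radians subtended by a sphere of radius r whose center is at
// distance d from the camera. Returns pi if the camera is on or inside the
// sphere, where the sphere fills the view.
float AngularDiameter(float d, float r)
{
    const float kPi = 3.14159265f;
    if (d <= r)
        return kPi;
    return 2.0f * std::asin(r / d);
}

// True while the planet should still be drawn as an impostor.
bool UseImpostor(float d, float r)
{
    const float kMaxAngle = 1.57079633f;   // 90 degrees, the threshold above
    return AngularDiameter(d, r) < kMaxAngle;
}
```

With this definition the switch happens when the camera is about 1.41 radii from the planet's center, since 2 * asin(r/d) reaches 90 degrees at d = r * sqrt(2).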
I have one last thing to mention about impostors. They are also used for rendering clouds, forests, cities, and several other large-scale details that you might want to render on a planet. But care must be taken in how you manage them, or you will quickly find yourself out of video memory. In my demo, I choose impostor resolutions from 512x512 all the way down to 8x8 based on the planet's distance to the camera. If I start to use impostors more extensively, I will need to create a texture cache for them.
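One simple policy for picking a resolution in that range is sketched below. It is not the demo's GetImpostorResolution(); it assumes you have already estimated how many screen pixels the billboard spans, and the function name is hypothetical.

```cpp
// Hypothetical policy: pick the smallest power-of-two texture size that is at
// least as large as the number of screen pixels the billboard covers, clamped
// to the 8 to 512 range mentioned above.
short ChooseImpostorResolution(float pixelsCovered)
{
    short resolution = 8;
    while (resolution < 512 && resolution < pixelsCovered)
        resolution *= 2;
    return resolution;
}
```

A billboard about 100 pixels across would get a 128x128 texture, and anything larger than 512 pixels is capped at 512x512.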
Looking at the Code
This demo is getting pretty large, and I don't have time to explain everything I've added to it since the last article. Since the main addition I've made since the last article is impostors, I'll try to explain that piece as well as I can.
Let's start by analyzing the members of the CImpostor class. If you haven't looked at the source code from my previous articles, C3DObject contains position and orientation information about the object itself. It also contains a bounding radius for the object, which I recently added for a number of things like view frustum culling, collision detection, and impostor rendering. For impostors it is used to determine how much screen space the object is taking up, which is then used to determine the resolution of the billboard texture as well as the projection matrix for rendering into that texture. A bounding box or convex hull may easily be used instead of a bounding radius, but for planets a bounding radius gives the best fit.
class CImpostor : public C3DObject
{
    // Static members to hold state information between calls to
    // InitImpostorRender() and FinishImpostorRender().
    static int m_nViewport;

    // These members contain state information regarding the most recent update of
    // the billboard texture. They are used to determine how much error is in the
    // current billboard texture, and how badly it needs to be updated.
    float m_fDistance;        // The distance to the camera
    CVector m_vOrientation;   // The orientation of the camera relative to the impostor's orientation

    // These members contain the information used to render the billboard
    float m_fBillboardRadius; // The radius of the billboard texture
    CVector m_vUp;            // The up vector used to generate the texture
    CTexture m_tBillboard;    // The billboard texture map object

    // Other informational members.
    short m_nFlags;           // A set of bit flags for impostor states
    short m_nResolution;      // The resolution of the billboard texture

public:
    // These are the three main methods.
    // To update the billboard, call InitImpostorRender(), render the object, and
    // call FinishImpostorRender(). To draw the billboard, call DrawImpostor().
    void InitImpostorRender(C3DObject *pCamera);
    void FinishImpostorRender();
    void DrawImpostor(C3DObject *pCamera);

    // Used to determine when the billboard texture needs to be updated
    float GetImpostorError(C3DObject *pCamera);

    // Helper methods
    void GetImpostorViewMatrix(C3DObject *pCamera, CMatrix &mView);
    float GetImpostorScreenSpace(float fDistance);
    short GetImpostorResolution(float fScreenSpace);
};
Overall the concept is not too complex, but the math to fit the billboard tightly around the object and to generate the right projection matrix is not trivial. Keep in mind that we need the projection matrix to respect perspective projection or it will not look right, especially when switching between impostor rendering and normal rendering. Here is some pseudo-code with comments for InitImpostorRender(), which should explain the bulk of the math.
void CImpostor::InitImpostorRender(C3DObject *pCamera)
{
    CVector vView = vector from the camera to the center of the object;
    m_fDistance = distance between the two (or vView.Magnitude());
    m_vOrientation = unit vector pointing from center to camera (relative to object's orientation);
    m_vUp = arbitrary up vector (make sure it's not parallel to vView);
    CMatrix mModel = model matrix for this object;
    CMatrix mView = view matrix for the camera using vView and m_vUp;

    if(using bounding box or convex hull)
    {
        for(each vertex in bounding volume)
        {
            Transform vertex by model/view matrices calculated above;
            Divide x and y values by -z (poor man's perspective projection matrix);
            Store max and min values for x and y;
        }
        // The min and max x and y values define the billboard's rectangle
        // For x and y the range -1 to 1 covers a 90 degree field of view
        // Use that information to determine screen space and billboard size
        // Calculate a projection matrix that tightly encloses the bounding box
    }
    else if(using bounding radius)
    {
        // See figure 1 below (values d and r are already known)
        // h is easy because it is part of a right triangle with d and r
        h = sqrt(d*d - r*r);        // Pythagorean theorem gives distance to horizon
        // a and b are also parts of right triangles, but first we must determine
        // the length of the horizon ray up to a and b
        cosAngle = h/d;             // This is the cosine of angle at camera vertex
        ha = (d-r)/cosAngle;        // This is the length of the horizon ray to a
        hb = d/cosAngle;            // This is the length of the horizon ray to b
        a = sqrt(ha*ha - (d-r)*(d-r));
        b = sqrt(hb*hb - d*d);
        glFrustum(-a, a, -a, a, d-r, d+r);  // This is fairly straightforward
        m_fBillboardRadius = b;             // Again, fairly straightforward
    }

    // Push impostor projection and model-view matrices onto the stack
    // Determine texture resolution and set up viewport
}
Figure 1 - Impostor view frustum for bounding sphere
Though the bounding sphere requires slightly more complicated geometry, it actually requires less code and is more efficient. Because the view frustum and billboard are perfect squares, they fit nicely into standard square textures. Bounding boxes will have a rectangular view frustum and billboard, and to fit it into a texture efficiently (i.e. without a lot of unused space), the up vector chosen can be much more important. I would not really recommend using bounding boxes because even though they can use the texture more efficiently, they can only help in one of two dimensions, and most textures are square anyway. If you plan to use the rectangular texture extension to try to optimize video memory usage, it may be worth playing around with.
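The bounding-sphere branch of the pseudo-code can be pulled out into a small standalone function for experimenting with the numbers. This is a sketch that follows the same steps; SphereFrustum and ComputeSphereFrustum() are hypothetical names, and it assumes the camera is outside the sphere.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type for the bounding-sphere case.
struct SphereFrustum
{
    double a;       // half-size of the near plane (left/right/bottom/top = -a..a)
    double b;       // billboard radius at the distance of the object's center
    double zNear;   // d - r
    double zFar;    // d + r
};

// d is the distance from the camera to the sphere's center, r is its radius.
SphereFrustum ComputeSphereFrustum(double d, double r)
{
    assert(r > 0.0 && d > r);                 // camera must be outside the sphere
    const double h = std::sqrt(d*d - r*r);    // distance to the horizon
    const double cosAngle = h / d;            // cosine of the angle at the camera
    const double ha = (d - r) / cosAngle;     // horizon ray length to the near plane
    const double hb = d / cosAngle;           // horizon ray length to the center plane

    SphereFrustum f;
    f.a = std::sqrt(ha*ha - (d - r)*(d - r));
    f.b = std::sqrt(hb*hb - d*d);
    f.zNear = d - r;
    f.zFar = d + r;
    return f;
}
```

For d = 5 and r = 3 this gives h = 4, a = 1.5, and b = 3.75, and the result would be passed on as glFrustum(-a, a, -a, a, zNear, zFar). The same values also follow from the shorter forms a = (d - r) * r / h and b = d * r / h.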
The demo also includes a simple wrapper class around the WGL_ARB_pbuffer extension. The code is straightforward enough that there's no need to explain it in the article, but some of the concepts behind it need to be explained. First of all, a pbuffer is an off-screen video buffer that you can create with its own rendering context. It's like having an extra back buffer to render to, but it can be any size or pixel format you want. It is not tied to the size and format of the rendering window. When you're finished rendering to this pbuffer, you can copy its contents into a texture object. If your video card supports the WGL_ARB_render_texture extension, you can even bind a texture object directly to the pbuffer (which saves you the overhead of copying it).
Not all video cards support the WGL_ARB_pbuffer extension, and even fewer seem to support the WGL_ARB_render_texture extension. Luckily, it is easy enough to organize your code to use the back buffer when these extensions are not supported. In fact, my demo is written to work this way. Most current video cards should support copying from the back buffer to a texture.
Unfortunately, rendering to the back buffer is usually slower and less convenient than using a pbuffer. This is because you have to call glClear() before you render each impostor. The glClear() call is slower on a large back buffer than on a small pbuffer, and it's more convenient to be able to update an impostor after you've started rendering to the back buffer (which you can't do if you need to clear it for an impostor).
There is one more requirement that might trip you up. Because an impostor texture needs to be partially transparent, your video card must support "destination alpha". This means that it must be able to store an alpha channel in the back buffer (or pbuffer) that you will be rendering into. Without that alpha channel, copying your buffer into a texture with an alpha channel will set all alpha values to 1.0, making your texture completely opaque. When rendering impostor textures, you must also make sure that the alpha component of the clear color is 0.
Although the WGL_ARB_render_texture extension seems like it may offer a decent improvement in performance, it doesn't really work well with impostors. Let's say you create a solar system with 50 planets and moons. This will require up to 50 textures just for the planetary bodies (you may want to use more impostors for clouds, forests, etc.) Since most of these textures will be distant and can be made fairly small, you shouldn't run out of video memory creating that many textures on today's video cards. However, you're not allowed to create that many rendering contexts, which means you can't create that many pbuffers to bind directly to texture objects.
It is possible to store many small textures inside one large one. Many games merge all their smaller textures into one large one to avoid unnecessary context switching in the video card. For impostors, you could do the same in one of two ways. First, you could render impostors into pbuffers of a certain size, then copy the pbuffer to a specific location of a larger texture. Second, you could create a large pbuffer and use glViewport() to render into a small part of the pbuffer. The second idea might allow you to use WGL_ARB_render_texture efficiently. One possible drawback is that glClear() seems to run slower if you're not clearing the full buffer, and it may actually be quicker to copy than to bind for smaller impostor textures. I suppose the only way to find out for sure is to take the time to test it.
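For the second approach, the bookkeeping amounts to mapping an impostor's slot number to a rectangle inside the large buffer. The sketch below assumes a square buffer divided into equally sized square tiles; Rect and AtlasTile() are hypothetical names.

```cpp
#include <cassert>

// Hypothetical rectangle in pixels, in the form glViewport() expects.
struct Rect { int x, y, width, height; };

// Returns the rectangle for tile number `index` in a square buffer of
// atlasSize x atlasSize pixels, split into tiles of tileSize x tileSize.
// Tiles are numbered left to right, then bottom to top.
Rect AtlasTile(int index, int atlasSize, int tileSize)
{
    assert(tileSize > 0 && atlasSize >= tileSize);
    const int tilesPerRow = atlasSize / tileSize;
    assert(index >= 0 && index < tilesPerRow * tilesPerRow);
    return Rect{ (index % tilesPerRow) * tileSize,
                 (index / tilesPerRow) * tileSize,
                 tileSize,
                 tileSize };
}
```

The rectangle would be passed to glViewport() before rendering the impostor. Note that glClear() ignores the viewport, so clearing only one tile also requires glScissor() with the same rectangle and GL_SCISSOR_TEST enabled.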
We now have a demo capable of rendering an entire solar system fairly efficiently. Planets that aren't close to the camera aren't updated or fully rendered every frame, and inter-planetary distances should not cause any scale or depth problems. Believe it or not, the only depth problem you may have now is when you get too close to a planet to render it as an impostor, but the horizon is still pretty far away. You can alleviate this problem by changing the distance at which you switch to normal rendering, changing the function that scales distance and size, and possibly by dynamically moving the near and far clipping planes based on the horizon distance. The last option will give the best Z-Buffer precision when the camera is close to the ground, which is when you are likely to need it most.
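The last option could be sketched as follows. This is only one possible policy and is not taken from the demo: the names are hypothetical, and both the margin beyond the horizon and the far-to-near ratio are assumed tuning values.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type.
struct ClipPlanes { double zNear, zFar; };

// r is the planet's radius and altitude is the camera's height above the
// surface, in the same units. The far plane is placed a little beyond the
// horizon, and the near plane is kept at a fixed ratio to the far plane.
// Both margin and ratio are assumptions to be tuned, not values from the demo.
ClipPlanes ClipPlanesFromHorizon(double r, double altitude,
                                 double margin = 1.1, double ratio = 10000.0)
{
    assert(r > 0.0 && altitude > 0.0 && margin >= 1.0 && ratio > 1.0);
    // Horizon distance is sqrt(d*d - r*r) with d = r + altitude; this form
    // avoids subtracting two nearly equal numbers when altitude is small.
    const double horizon = std::sqrt(altitude * (2.0 * r + altitude));
    ClipPlanes planes;
    planes.zFar = horizon * margin;
    planes.zNear = planes.zFar / ratio;
    return planes;
}
```

As the camera descends, the horizon moves closer, so both planes pull in and the Z-Buffer's precision is concentrated where the visible terrain actually is. Tall terrain just beyond the horizon is the reason for the margin.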
I've also added some classes to organize planets and moons into a solar system, along with code to load the solar system from an INI file. The planets don't move at this point, but it shouldn't be too hard to add orbits and motion. The demo project comes with an INI file defining 4 planets with a single moon orbiting the third planet. For now I divide orbital distances by 10 when loading the planets to make them easier to see from each other.
At this time, I don't have any plans to write another article for a while. If you have any questions, comments, or ideas, feel free to drop me an email (see the Author's Bio link above). If you're working on something similar, or using my code in a project of your own, I'd also be interested in hearing from you.
The Virtual Terrain Project: A one-stop site for all your terrain/world modeling and rendering needs. If you liked my articles, then you need to check this site out.
Delphi3D Impostor Article: A short but informative article on impostors. I couldn't find it, but source code may be available in Delphi using OpenGL.
SkyWorks Cloud Rendering Engine: A real-time volumetric cloud rendering engine using impostors. Check out the publications as well as the source code.
Efficient Impostor Manipulation for Real-Time Visualization of Urban Scenery: A publication explaining the use of impostors to render large urban areas.
Real-Time Tree Rendering: A publication explaining the use of impostors and multi-resolution rendering techniques to render detailed trees.