
Culling (sometimes called hidden surface removal) reduces engine workload, resulting in higher frame rates. This excerpt from Programming a Multiplayer FPS in DirectX explores a variety of culling methods.

April 11, 2005


Author: Vaughan Young


One of the most important aspects of scene management is rendering the scene. We are not going to discuss rendering the scene just yet; instead, we are going to discuss how to avoid rendering the scene. That's right, we want to look at methods that will allow our engine to avoid rendering as much of the scene as possible. Why do we want to do this? Quite simply, because the less we render, the less work our video card has to do, and the less work the video card has to do, the better.

A game's performance is often measured by how many frames per second it can achieve in a given situation. This is often referred to as a benchmark. We measure performance in this manner because it gives us an indication of how well the underlying engine is handling the scene. When the engine has to process and render a lot of data, the frame rate drops. On the other hand, when the engine processes and renders very little, you will notice an increased frame rate. Therefore, it makes sense to introduce techniques into the engine that minimize the amount of data that the engine has to process and render for each frame.

Measuring a game's performance by frame rate alone is technically inaccurate. The reason for this is that a drop in frame rate from 100 fps to 90 fps is not the same as a drop from 40 fps to 30 fps. Instead, you should look at the change in frame rate as a percentage. Therefore, changing from 100 fps to 90 fps is a 10% performance hit, whereas changing from 40 fps to 30 fps is in fact a 25% performance hit, which is much worse.
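The arithmetic above can be sketched in a few lines of C++. The function names PercentDrop and FrameTimeMs are hypothetical helpers made up for this illustration; they are not part of the book's engine.

```cpp
#include <cassert>

// Relative performance hit, in percent, when the frame rate falls
// from fpsBefore to fpsAfter. (Hypothetical helper for illustration.)
double PercentDrop(double fpsBefore, double fpsAfter)
{
    assert(fpsBefore > 0.0);
    return (fpsBefore - fpsAfter) / fpsBefore * 100.0;
}

// Time the engine spends on a single frame, in milliseconds.
double FrameTimeMs(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}
```

PercentDrop(100, 90) gives roughly 10, while PercentDrop(40, 30) gives roughly 25. Looking at frame times tells the same story: going from 100 fps to 90 fps adds about 1.1 ms to each frame, whereas going from 40 fps to 30 fps adds about 8.3 ms.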

One of the most common methods of alleviating the strain on the video card is through culling (sometimes called hidden surface removal), which is a general term used to define the filtering of faces that do not require rendering. What this means is that any face in the scene that does not need to be rendered in a particular frame does not need to be pushed down the DirectX graphics pipeline. This does not mean that the face does not exist, or is removed from the scene. It simply means that the engine does not send the face to DirectX to be rendered. The real question is, how do we determine if a face needs to be culled?

There are a number of common methods we can use to perform culling, with some being more complicated than others. The most effective solution usually combines two or more methods to ensure the most accurate culling results. The reason for this is that each method has its strengths and weaknesses. Some methods are computationally fast, but are relatively inaccurate, while others are slower, but yield better precision. It is usually best practice to employ a fast and rough culling method to cull away most of the non-visible faces (i.e., faces that cannot be seen by the viewer in the current frame). Then we follow this up with a slower, more accurate method that removes the extra faces not culled away by the fast method.

Developing an effective culling algorithm is like a fine balancing act. Your computer has two processors (the CPU and the GPU) that you must keep busy at all times. However, you don't want to let one of them become overloaded, therefore creating a bottleneck. The more culling you perform, the more strain you put on the CPU. On the other hand, the less culling you do, the more strain you put on the GPU. Figure 10.1 shows the relationship between culling and performance. You can see that as we perform more culling on our scene, the better the performance, up to a point, after which performance begins to drop off. This happens because the CPU is causing a bottleneck due to the amount of processing involved with the elaborate culling algorithm. You should also notice from Figure 10.1 that culling is affected by diminishing returns. This means that the more you do, the less of a performance increase you gain. The idea is to find the perfect balance among the amount of culling the CPU performs, the amount of geometry the GPU renders, and the amount of work you have to do to achieve satisfactory results.

Figure 10.1 The relationship between culling and performance.

As previously mentioned, there are a number of common culling methods in use today. Table 10.1 shows three common methods used to cull hidden faces, along with their advantages and disadvantages.

For our culling system, we will use all three of the methods shown in Table 10.1. We have already touched on back face culling in Chapter 5, so you should recall what that is. Fortunately for us, DirectX has back face culling built into it, and using it is as simple as switching it on. In fact, it is on by default, so that makes it even easier. The other two culling methods are new and require a little discussion. We'll start with frustum culling.
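DirectX does the back face test for us (in Direct3D 9 the switch is the D3DRS_CULLMODE render state, whose default value D3DCULL_CCW culls faces with counterclockwise winding), so we never have to write it ourselves. Still, it can help to see the idea behind it. The following is only a conceptual sketch, not what the hardware literally does: the Vec3 type and the IsBackFacing function are hypothetical, and the sketch assumes front faces are wound clockwise as seen by the viewer.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

static Vec3 Sub(const Vec3& a, const Vec3& b)
{
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}

static float Dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

static Vec3 Cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Returns true when the face (v0, v1, v2) points away from the viewer.
// Assumption: front faces are wound clockwise when seen from the eye.
bool IsBackFacing(const Vec3& v0, const Vec3& v1, const Vec3& v2,
                  const Vec3& eye)
{
    // With clockwise winding, this normal points toward the viewer of a
    // front face.
    const Vec3 normal = Cross(Sub(v1, v0), Sub(v2, v0));
    assert(Dot(normal, normal) > 0.0f); // a degenerate face has no facing

    // If the normal and the direction to the eye disagree, the viewer is
    // looking at the back of the face.
    return Dot(normal, Sub(eye, v0)) <= 0.0f;
}
```

Swapping any two vertices reverses the winding, which flips the result. This is why a model exported with the wrong winding order appears inside out when back face culling is on.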




| Method | Advantages | Disadvantages |
| --- | --- | --- |
| Back Face | DirectX can perform this for us. It is just a matter of switching it on. | None really. The performance gain justifies the addition of this extra step to the graphics pipeline. |
| Frustum | Relatively easy to implement. Can be quite computationally inexpensive. | Fairly inaccurate. Many faces close to the view will be rendered. |
| Occlusion | Perfect for removing surfaces in view that are hidden behind something. | Can be a significant burden on the CPU in complex scenes. |

Table 10.1 Culling Methods

In the last chapter, we talked about the view and projection matrices. We know that the view matrix defines the virtual camera used to view the scene, and the projection matrix acts like the lenses of the camera to control the projection of the 3D scene onto your flat monitor screen. If you combine these two matrices (i.e., multiply them together), you can derive a new matrix that represents the field of view (FOV). The FOV simply defines what can be seen by the camera. To better understand this principle, you can look at the FOV of your own eyes. While keeping your head and eyes still, facing straight ahead, extend your right arm out to your side and move it back until you can no longer see it in your peripheral vision. This means that your arm is no longer in your FOV. If you move your arm back in slowly, it will eventually come back into your FOV when you can see it again in the corner of your eye.
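Combining the two matrices is a plain 4x4 matrix multiplication. With D3DX you would normally call D3DXMatrixMultiply; the sketch below spells the operation out with a hypothetical Matrix4 type so that it stands on its own. The name MakeFovMatrix is also made up for this example. Note the order: with Direct3D's row-vector convention, the view matrix comes first.

```cpp
#include <cassert>

// Row-major 4x4 matrix, indexed as m[row][col].
struct Matrix4
{
    float m[4][4];
};

// Returns a * b. With row vectors, a point is transformed by a first
// and then by b.
Matrix4 Multiply(const Matrix4& a, const Matrix4& b)
{
    Matrix4 out{}; // all elements start at zero
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            for (int k = 0; k < 4; ++k)
                out.m[row][col] += a.m[row][k] * b.m[k][col];
    return out;
}

// Combines the view and projection matrices into the single matrix the
// text calls the FOV matrix.
Matrix4 MakeFovMatrix(const Matrix4& view, const Matrix4& projection)
{
    // Sanity check: a view matrix is affine, so in the row-vector
    // convention its last column is (0, 0, 0, 1).
    assert(view.m[0][3] == 0.0f && view.m[1][3] == 0.0f &&
           view.m[2][3] == 0.0f && view.m[3][3] == 1.0f);
    return Multiply(view, projection);
}
```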

We will use this exact principle to cull faces that are not in the camera's FOV. To achieve this we need to create what is called a frustum (or a view frustum) from the camera's FOV matrix. You can imagine a view frustum like a pyramid on its side, with its apex positioned at the camera and its base extended away in the direction of the camera's facing. Figure 10.2 shows what a view frustum looks like.

We won't look at the actual implementation details just yet. Instead, let's look at how the view frustum can be used to cull the scene. Once a view frustum has been calculated, we can define it using a set of planes. A plane (which is provided by the D3DX library through the use of the D3DXPLANE structure) is like a flat 2D surface, which extends infinitely through 3D space; it has no thickness and it has no boundaries. It has two sides (a positive side and a negative side), and every vertex in 3D space is located on one of these sides, unless it lies exactly on the plane itself. Our view frustum can be defined using six of these planes (however, you can get away with only five if you ignore the near plane, which is what we will do in our implementation). If you look at Figure 10.2 again, you can see that we need a plane for each of the sides of the pyramid.
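To make the idea of a plane's two sides concrete, here is a minimal sketch. The Plane struct mirrors the four members of D3DXPLANE (a, b, c, d), and PlaneDotCoord performs the same calculation as the D3DX function D3DXPlaneDotCoord. The Vec3 type, the Side enumeration, and the Classify function are hypothetical names used only for this illustration.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// The plane a*x + b*y + c*z + d = 0, where (a, b, c) is the normal.
struct Plane { float a, b, c, d; };

enum class Side { Positive, Negative, OnPlane };

// Evaluates the plane equation at v. The result is positive on the side
// the normal points toward and negative on the other side. If the
// normal has unit length, the result is the signed distance to the plane.
float PlaneDotCoord(const Plane& p, const Vec3& v)
{
    return p.a * v.x + p.b * v.y + p.c * v.z + p.d;
}

// Reports which side of the plane a vertex is on.
Side Classify(const Plane& p, const Vec3& v)
{
    assert(p.a != 0.0f || p.b != 0.0f || p.c != 0.0f); // needs a normal

    const float result = PlaneDotCoord(p, v);
    if (result > 0.0f) return Side::Positive;
    if (result < 0.0f) return Side::Negative;
    return Side::OnPlane;
}
```

For example, the plane (0, 1, 0, 0) is the ground plane y = 0 with its normal pointing up, so a vertex at (0, 2, 0) is on the positive side and a vertex at (0, -3, 0) is on the negative side.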

Figure 10.2 A view frustum, based on the virtual camera's field of view matrix.

The planes are not enclosed by the actual shape of the sides of the view frustum pyramid, as they extend indefinitely along their axes. The shape of the view frustum is enforced by the fact that the planes intersect each other, therefore creating a pyramid-shaped box in 3D space. Once we have this box we can test if faces are visible by checking if they are on the inside or the outside of this box, which is quite easy to do. Figure 10.3 shows how planes are used to define a physical view frustum in 3D space.
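The implementation details come later, but as a rough preview, here is one widely used way to do it, which may differ from the approach this engine ends up taking. The sketch pulls five planes (skipping the near plane) straight out of the combined view and projection matrix and then tests vertices and faces against them. It assumes Direct3D's row-vector, left-handed conventions, and every type and function name in it is hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Plane { float a, b, c, d; };
struct Matrix4 { float m[4][4]; };   // row-major, m[row][col]
struct Frustum { Plane planes[5]; }; // left, right, bottom, top, far

// Scales the plane so its normal has unit length.
static Plane Normalize(const Plane& p)
{
    const float len = std::sqrt(p.a * p.a + p.b * p.b + p.c * p.c);
    assert(len > 0.0f);
    return { p.a / len, p.b / len, p.c / len, p.d / len };
}

// Builds the plane (column 4) + sign * (column col + 1) of the matrix.
static Plane PlaneFromColumns(const Matrix4& vp, int col, float sign)
{
    const Plane p = { vp.m[0][3] + sign * vp.m[0][col],
                      vp.m[1][3] + sign * vp.m[1][col],
                      vp.m[2][3] + sign * vp.m[2][col],
                      vp.m[3][3] + sign * vp.m[3][col] };
    return Normalize(p);
}

// Extracts five inward-facing planes from the view * projection matrix.
Frustum ExtractFrustum(const Matrix4& viewProj)
{
    Frustum f;
    f.planes[0] = PlaneFromColumns(viewProj, 0, +1.0f); // left
    f.planes[1] = PlaneFromColumns(viewProj, 0, -1.0f); // right
    f.planes[2] = PlaneFromColumns(viewProj, 1, +1.0f); // bottom
    f.planes[3] = PlaneFromColumns(viewProj, 1, -1.0f); // top
    f.planes[4] = PlaneFromColumns(viewProj, 2, -1.0f); // far
    return f;
}

static float Distance(const Plane& p, const Vec3& v)
{
    return p.a * v.x + p.b * v.y + p.c * v.z + p.d;
}

// A vertex is inside the frustum if it is not behind any plane.
bool IsPointInside(const Frustum& f, const Vec3& v)
{
    for (const Plane& p : f.planes)
        if (Distance(p, v) < 0.0f) return false;
    return true;
}

// A face can be culled safely when all three of its vertices are behind
// the same plane. This is conservative: a few faces that are actually
// out of view may still be kept.
bool IsFaceOutside(const Frustum& f, const Vec3& v0, const Vec3& v1,
                   const Vec3& v2)
{
    for (const Plane& p : f.planes)
        if (Distance(p, v0) < 0.0f && Distance(p, v1) < 0.0f &&
            Distance(p, v2) < 0.0f)
            return true;
    return false;
}
```

Testing the three vertices against the same plane matters. A large face can have every vertex outside the frustum and still cross the view, so checking each vertex on its own would cull faces that should be rendered.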

Now that you understand how view frustum culling works, you shouldn't have too much trouble understanding how occlusion culling works, as it relies on the same principle. Occlusion culling means culling the parts of the scene that are hidden behind occluders. So what is an occluder? An occluder can be any object that has the ability to conceal parts of the scene from the viewer. A typical example is a large building or a solid wall. If the player is viewing the scene and a large solid wall hides a good portion of the player's view, then it makes sense to cull anything that is behind that wall. Figure 10.4 illustrates the example.

Figure 10.3 The top and front of a view frustum defined by infinite planes.

Figure 10.4 A scene that is partially hidden due to the occluders.

How do we determine if something is hidden by an occluder? We use the same principle as discussed for frustum culling. All we need to do is create a frustum that extends from the occluder, away from the viewer. Then, rather than culling everything that is outside the frustum, we cull everything that is inside the frustum. Figure 10.5 shows how frustums are used for occlusion culling.
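As a rough illustration of this idea, the sketch below builds an occlusion frustum from the viewer's position and a flat, convex occluder such as a wall, and then checks whether an object's bounding sphere is completely hidden inside it. This is only one possible approach under simplifying assumptions (a single planar, convex occluder, and objects represented by spheres), and all of the names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
struct Plane { float a, b, c, d; };

static Vec3 Sub(const Vec3& a, const Vec3& b)
{
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}
static float Dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
static Vec3 Cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}
static float Distance(const Plane& p, const Vec3& v)
{
    return p.a * v.x + p.b * v.y + p.c * v.z + p.d;
}
static Plane Flip(const Plane& p) { return { -p.a, -p.b, -p.c, -p.d }; }

// Unit-length plane through three points.
static Plane PlaneFromPoints(const Vec3& p0, const Vec3& p1, const Vec3& p2)
{
    const Vec3 n = Cross(Sub(p1, p0), Sub(p2, p0));
    const float len = std::sqrt(Dot(n, n));
    assert(len > 0.0f); // the three points must not be collinear
    const Vec3 unit = { n.x / len, n.y / len, n.z / len };
    return { unit.x, unit.y, unit.z, -Dot(unit, p0) };
}

// Builds planes that enclose the region hidden behind the occluder.
// All planes face inward, toward the hidden region.
std::vector<Plane> BuildOcclusionFrustum(const Vec3& eye,
                                         const std::vector<Vec3>& occluder)
{
    assert(occluder.size() >= 3);
    const std::size_t count = occluder.size();

    Vec3 centre = { 0.0f, 0.0f, 0.0f };
    for (const Vec3& v : occluder)
        centre = { centre.x + v.x, centre.y + v.y, centre.z + v.z };
    const float n = static_cast<float>(count);
    centre = { centre.x / n, centre.y / n, centre.z / n };

    std::vector<Plane> planes;

    // The occluder's own plane, facing away from the viewer.
    Plane cap = PlaneFromPoints(occluder[0], occluder[1], occluder[2]);
    if (Distance(cap, eye) > 0.0f) cap = Flip(cap);
    planes.push_back(cap);

    // One side plane per edge, passing through the viewer's position.
    for (std::size_t i = 0; i < count; ++i)
    {
        Plane side = PlaneFromPoints(eye, occluder[i],
                                     occluder[(i + 1) % count]);
        if (Distance(side, centre) < 0.0f) side = Flip(side);
        planes.push_back(side);
    }
    return planes;
}

// An object is hidden only if its whole sphere is inside every plane.
bool IsSphereHidden(const std::vector<Plane>& planes, const Vec3& centre,
                    float radius)
{
    for (const Plane& p : planes)
        if (Distance(p, centre) < radius) return false;
    return true;
}
```

Notice that the test is the reverse of the view frustum test: here an object is culled when it is entirely inside the frustum. An object that pokes out past the edge of the occluder, even slightly, must still be rendered.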

Figure 10.5 How frustums are used for occlusion culling.

With these three culling methods we are able to cull a good portion of our scene and prevent many of the hidden surfaces from being sent through the DirectX graphics pipeline. You should be aware that this system is not perfect; there will still be a number of faces that are rendered each frame that don't need to be. However, it is not necessary to have a perfect system. We must keep the balance in mind and prevent our CPU from becoming a bottleneck through over-culling. Figure 10.6 gives you an idea of the kind of returns each of the culling methods will give us. As you can see, each additional culling method removes less than the one before it.

You should also note that the effectiveness of our culling methods is strongly linked to the data set we give them to cull. In other words, what is the system trying to cull: individual faces, groups of faces, or whole objects?
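To give a feel for why the choice of data set matters, consider culling whole objects. If each object is wrapped in a bounding sphere, a single cheap test can reject all of its faces at once. The sketch below assumes the frustum planes are normalized and face inward; the BoundingSphere type and both functions are hypothetical, and the sphere it computes is simple rather than as tight as possible.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
struct Plane { float a, b, c, d; }; // unit normal, pointing inward
struct BoundingSphere { Vec3 centre; float radius; };

// Wraps a set of vertices in a sphere centred on their average position.
BoundingSphere ComputeBoundingSphere(const std::vector<Vec3>& vertices)
{
    assert(!vertices.empty());

    Vec3 centre = { 0.0f, 0.0f, 0.0f };
    for (const Vec3& v : vertices)
        centre = { centre.x + v.x, centre.y + v.y, centre.z + v.z };
    const float count = static_cast<float>(vertices.size());
    centre = { centre.x / count, centre.y / count, centre.z / count };

    // The radius reaches the vertex furthest from the centre.
    float radiusSq = 0.0f;
    for (const Vec3& v : vertices)
    {
        const float dx = v.x - centre.x;
        const float dy = v.y - centre.y;
        const float dz = v.z - centre.z;
        const float distSq = dx * dx + dy * dy + dz * dz;
        if (distSq > radiusSq) radiusSq = distSq;
    }
    return { centre, std::sqrt(radiusSq) };
}

// The whole object can be culled when its sphere lies completely behind
// any one of the frustum planes.
bool IsSphereOutside(const std::vector<Plane>& frustum,
                     const BoundingSphere& sphere)
{
    for (const Plane& p : frustum)
    {
        const float dist = p.a * sphere.centre.x + p.b * sphere.centre.y +
                           p.c * sphere.centre.z + p.d;
        if (dist < -sphere.radius) return true;
    }
    return false;
}
```

The trade-off is accuracy. An object that straddles the edge of the frustum is kept in its entirety, including the faces that are out of view, whereas testing face by face would catch more of them at a higher CPU cost.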

Figure 10.6 The effectiveness of our culling methods


This article is excerpted from Programming a Multiplayer FPS in DirectX (ISBN 1-58450-363-7). For more information about the book, please visit http://www.charlesriver.com/Books/BookDetail.aspx?productID=91312.

