Let There Be Light!: A Unified Lighting Technique for a New Generation of Games
This article presents a unified per-pixel lighting solution capable of handling an arbitrary number of dynamic lights, and techniques for optimizing such a solution.
Game developers have always strived to provide a greater sense of realism to players in order to further immerse them in the worlds they create. With the coming of vertex and pixel shaders, a new power became available that allowed developers to advance this goal by applying complex lighting and advanced visual effects to the scenes they created. In the last generation of gaming (i.e. the Xbox/PS2/GC era), dynamic lighting took a giant leap forward thanks to programmable vertex and pixel pipelines. Most games now support lighting calculated on a per-vertex basis, with some later entries into the market offering the more aesthetic per-pixel lighting solution. Now it's time to make another leap: from the vertex shader to the pixel shader.
Modern GPUs (Graphics Processing Units), and most notably those that will power the next generation of game consoles, offer an incredible amount of processing power. It is now possible to have all lighting in a scene computed at the per-pixel level using a variety of different techniques. While it is true that per-pixel lighting has been available through past shader models (SM), the introduction of SM3.0 has permitted developers to remove the bounds on the number of lights that could traditionally be calculated in a pass [1].
The goal of this article is to present the reader with a unified per-pixel lighting solution capable of handling an arbitrary number of dynamic lights, and techniques for optimizing such a solution should it be adopted. We will start by taking a look at the current state of lighting techniques and examining their limitations. The unified model will then be explained in detail and its limitations discussed, along with optimizations that can be made to ensure that the application runs at a solid frame rate.
Throughout the article, various bits of shader code will be presented to help explain implementation details. To this end, a test scenario has been established to illustrate the various limitations of existing lighting techniques, and how the unified lighting solution can implement the same scenario with superior results. The test scenario will consist of a static non-skinned object being affected by a number of lights (i.e. an environment). All light sources are assumed to be point lights. Adding code to support other kinds of lights as well as the specular lighting model is left to the reader. The lighting equations are readily available on the Internet for those interested.
For readers unfamiliar with lighting equations, a point light's contribution is computed as follows:
Given: N, the normal of the point on the surface to be lit
P_point, the position of the point to be lit
P_light, the position of the light
A0, A1 and A2, the attenuation factors for the light [2]
(1) dist = |P_point - P_light|, the distance from the light
(2) L_point = (P_point - P_light) / |P_point - P_light|, the normalized direction from the light to the surface point
(3) att = 1.0 / (A0 + (A1 * dist) + (A2 * dist^2)), the attenuation of the light
and finally,
(4) C_point = att * (N · -L_point), the color contribution to the overall lighting
For this article, we will simply be using inverse linear drop off as our attenuation factor. Inverse linear attenuation is 1/dist; it is equivalent to setting A0 = 0.0, A1 = 1.0, and A2 = 0.0.
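As a point of reference, a minimal HLSL helper implementing equations (1) through (4) might look like the following. This is an illustrative sketch only; the function and parameter names are mine rather than taken from the article's listings, and it uses the inverse linear attenuation described above.

```hlsl
// Sketch of equations (1)-(4) for a single point light, using
// inverse linear attenuation (A0 = 0, A1 = 1, A2 = 0).
float3 ComputePointLight(float3 N,          // surface normal (assumed normalized)
                         float3 Ppoint,     // position of the point being lit
                         float3 Plight,     // position of the light
                         float3 lightColor) // color/intensity of the light
{
    float3 toPoint = Ppoint - Plight;           // vector from light to point
    float  dist    = length(toPoint);           // (1) distance from the light
    float3 Lpoint  = toPoint / dist;            // (2) normalized light-to-point direction
    float  att     = 1.0 / dist;                // (3) inverse linear attenuation
    float  NdotL   = saturate(dot(N, -Lpoint)); // (4) clamped Lambertian term
    return lightColor * att * NdotL;
}
```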
Overview of current lighting techniques
Vertex Lighting
The most common form of lighting in games today is vertex lighting. This is the fastest form of dynamic lighting presented in this article. Vertex lighting works much like it sounds; the color contribution of each light is calculated for each vertex on the surface and interpolated across the surface during rasterization.
For a quick lighting fix, this solution is robust and inexpensive. However, this method does have its drawbacks, which will be explored later in the article. In the meantime, see listing 1.1 for an abbreviated example of vertex lighting in HLSL.
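Listing 1.1 is not reproduced here; as a rough sketch of the idea (identifiers and structure are illustrative only, and assume the point-light model described above), a per-vertex lighting vertex shader could look something like this:

```hlsl
// Illustrative per-vertex point lighting (a sketch, not the article's listing 1.1).
#define NUM_LIGHTS 4

float4x4 matWorldViewProj;
float4x4 matWorld;
float3   lightPos[NUM_LIGHTS];    // world-space light positions
float3   lightColor[NUM_LIGHTS];  // light colors

struct VS_OUT
{
    float4 pos   : POSITION;
    float4 color : COLOR0;        // interpolated across the triangle
};

VS_OUT main(float4 pos : POSITION, float3 normal : NORMAL)
{
    VS_OUT o;
    o.pos = mul(pos, matWorldViewProj);

    float3 worldPos    = mul(pos, matWorld).xyz;
    float3 worldNormal = normalize(mul(normal, (float3x3)matWorld));

    // Accumulate each light's contribution at the vertex; the rasterizer
    // interpolates the resulting color across the surface.
    float3 diffuse = 0;
    for (int i = 0; i < NUM_LIGHTS; i++)
    {
        float3 toVertex = worldPos - lightPos[i];
        float  dist     = length(toVertex);
        float  att      = 1.0 / dist;  // inverse linear attenuation
        diffuse += lightColor[i] * att * saturate(dot(worldNormal, -toVertex / dist));
    }

    o.color = float4(diffuse, 1.0);
    return o;
}
```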
Per-pixel normal map based lighting
Normal map lighting takes a different approach to lighting by encoding tangent-space normals [3] for the surface in a texture to compute the lighting equation at each pixel, rather than at each vertex. Object space normal maps are also possible and are generally used to light dynamic objects. This form of bump mapping has quickly become the standard for games that want to push the graphical limits. Most new games rely on this as their primary lighting technique because it allows artists to achieve incredible levels of detail while still keeping the polygon count low. There is also a variation on normal map lighting called parallax mapping which encodes an additional height map value into the normal texture in order to simulate the parallax effect. You can find more information and implementation details about it at both ATI's and Nvidia's developer sites.
Performing normal map lighting is a three-step approach. The normal map must first be created, applied to the model and exported with tangent space information [4]. Next, when processing the vertices of the surface to be normal mapped on the vertex shader, a tangent matrix must be created to transform all positional lighting information into tangent space (all lighting equations must be performed in the same coordinate space). The tangent space matrix is a 3x3 matrix made up of the vertex's tangent, binormal and normal vectors. The binormal vector is obtained by computing the cross product of the tangent and the normal. Finally, the color contribution of each light is calculated in the pixel shader using the normal information fetched from the normal map and the tangent space lighting vectors computed from data transformed on the vertex shader. See listing 1.2 for a simple HLSL implementation of normal map lighting.
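Listing 1.2 is likewise not reproduced here; the vertex-shader half of the process might be sketched as follows for a single point light (identifiers are illustrative, and the tangent basis is assumed to be orthonormal):

```hlsl
// Illustrative tangent-space setup for normal map lighting
// (a sketch of the idea, not the article's listing 1.2).
float4x4 matWorldViewProj;
float4x4 matWorld;
float3   lightPos;                 // world-space light position

struct VS_OUT
{
    float4 pos        : POSITION;
    float2 uv         : TEXCOORD0;
    float3 lightDirTS : TEXCOORD1; // light-to-vertex vector in tangent space
};

VS_OUT main(float4 pos     : POSITION,
            float2 uv      : TEXCOORD0,
            float3 normal  : NORMAL,
            float3 tangent : TANGENT)
{
    VS_OUT o;
    o.pos = mul(pos, matWorldViewProj);
    o.uv  = uv;

    // Rotate the basis vectors into world space and build the tangent-space
    // matrix; the binormal is the cross product of the tangent and the normal.
    float3 T = normalize(mul(tangent, (float3x3)matWorld));
    float3 N = normalize(mul(normal,  (float3x3)matWorld));
    float3 B = cross(T, N);           // may need a sign flip depending on the exporter
    float3x3 worldToTangent = float3x3(T, B, N);

    // Project the world-space light vector onto the basis so the pixel shader
    // can compare it directly against the tangent-space normal map.
    float3 worldPos = mul(pos, matWorld).xyz;
    o.lightDirTS = mul(worldToTangent, worldPos - lightPos);

    return o;
}
```

The pixel shader would then fetch the tangent-space normal from the normal map and evaluate the lighting equation against lightDirTS.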
Limitations of existing lighting models
Though vertex and normal map lighting have been extensively used during the past 5-6 years, they are not without their flaws. Before addressing the unified per-pixel solution, it would be good to understand why a new lighting model is required.
Interpolated lighting (Vertex lighting)
While vertex lighting is quick and effective most of the time, there are certain scenarios in which the results obtained are less than ideal. Take, for example, a point light affecting a large polygon made up of two triangles. Because vertex lighting works by interpolating the colors obtained at each vertex, this scenario would result in the quad being equally lit across its entire surface (or not lit at all if the point light doesn't reach the edges; see figure 1). In order to get around this problem, the quad would have to be tessellated in order to achieve a falloff from the center to the edges (figure 2). This is counterproductive for the art team and is a problem that can be easily rectified using a per-pixel lighting approach.
Light count restrictions (Normal map lighting)
Normal map lighting manages to avoid the problem inherent to vertex lighting, but it has its own limitations. The problem stems from the fact that all lighting calculations must be performed in the same coordinate system. Current implementations of normal map lighting have the vertex shader transform the required lighting information into tangent space, and the color contributions are then computed on the pixel shader. The issue is obvious: the number of light sources that a surface can be lit by is limited to the number of registers the vertex shader can pass to the pixel shader.
Consider the case where we want to pass our lighting information through the texture registers. On a GeForce 6800, it's possible to pass 10 vectors of 4 floating-point entries. Assuming only the normal map, a regular texture and only directional lighting, that's a maximum of 8 lights applied to the surface at any one time. Many readers must be thinking "well, that's enough, isn't it?" at this point. And yes, 8 lights on a surface could be considered more than enough to achieve a well-lit and realistic-looking object. But consider this: point lights require 1 extra register to store a world position in order to calculate attenuation and direction (meaning 7 lights), and spot lights require 2 registers per light (setting the limit to 4 lights). Mixing the various forms of lighting only makes matters worse, and the subject of environment maps, secondary textures, and other complex effects which will be present in next-generation games hasn't even been brought up yet. Given complicated effects like those that will be required in the future, it's easy to wind up with only one or two registers available.
Unified per-pixel lighting solution
The proposed solution to the limitations of vertex and normal map lighting is actually a simple one: upload all world-space lighting information directly to the pixel shader and perform the lighting calculations on a per-pixel basis.
Implementing this solution requires slightly different approaches based upon whether or not a normal map is present. The following will examine how to get a world space normal onto the pixel shader when no normal map is present, and how to get the tangent space normal into world space when there is. Both solutions require the same data sets as their predecessors (vertex lighting and normal map lighting). With this unified lighting solution, the actual lighting code will always remain the same regardless of whether or not a normal map is present. This makes it practical to set up fragment shaders that compute the normal based on the scenario and then pass it on to the lighting fragment. The lighting shader code required for both implementations can be found in listing 2.1.
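As a hedged sketch of what such a shared lighting fragment might look like (reusing the illustrative ComputePointLight helper from earlier and constant-register arrays for the light data; none of these names come from the article's listing 2.1):

```hlsl
// Illustrative world-space lighting fragment shared by both paths.
// The caller supplies a world-space normal, whether it came from
// interpolated vertex normals or from a transformed normal map sample.
#define MAX_LIGHTS 8

float3 lightPos[MAX_LIGHTS];    // world-space light positions (pixel shader constants)
float3 lightColor[MAX_LIGHTS];  // light colors

float3 ComputeLighting(float3 worldNormal, float3 worldPos)
{
    float3 diffuse = 0;
    for (int i = 0; i < MAX_LIGHTS; i++)
    {
        diffuse += ComputePointLight(worldNormal, worldPos,
                                     lightPos[i], lightColor[i]);
    }
    return diffuse;
}
```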
Interpolate normals, not colors
The first solution examined is the one that addresses the issues of vertex lighting. Let's go back to the large-surface example given earlier. Because the colors of each vertex were interpolated across the surface of the polygon, and because so few vertices were used, the end result was a big quad with a more or less even lighting distribution (or none at all, in this case). To get around this problem (while still using the same number of vertices), we have to light each pixel individually so that the lighting distribution across the surface falls off properly according to whatever attenuation we've specified (inverse linear in this case).
To do this, we must get the surface normals, which are normally transformed and used locally in the vertex shader, onto the pixel shader. It turns out this can easily be done by taking advantage of the fact that the hardware will interpolate data across the surface when feeding the pixel shader. Instead of calculating the color for the vertex, we now simply transform the normal into world space and place it into a register alongside our emissive color for treatment on the pixel shader. Since we are doing point lighting, we also need to send the world-space position of the vertex across so that we get an interpolated world-space position at each pixel. We then simply perform the lighting calculation on the pixel shader much the same way we did it on the vertex shader. See listing 2.2 for an HLSL implementation and figure 3 for the visual result.
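A minimal sketch of this vertex/pixel shader pair (again reusing the illustrative ComputeLighting fragment above; this is not the article's listing 2.2) might be:

```hlsl
// Illustrative shaders for per-pixel lighting without a normal map.
float4x4 matWorldViewProj;
float4x4 matWorld;

struct VS_OUT
{
    float4 pos      : POSITION;
    float3 worldNrm : TEXCOORD0;  // world-space normal, interpolated by the hardware
    float3 worldPos : TEXCOORD1;  // world-space position, interpolated by the hardware
};

VS_OUT mainVS(float4 pos : POSITION, float3 normal : NORMAL)
{
    VS_OUT o;
    o.pos      = mul(pos, matWorldViewProj);
    o.worldNrm = mul(normal, (float3x3)matWorld);
    o.worldPos = mul(pos, matWorld).xyz;
    return o;
}

float4 mainPS(float3 worldNrm : TEXCOORD0,
              float3 worldPos : TEXCOORD1) : COLOR0
{
    // The interpolated normal is no longer unit length; renormalize before lighting.
    float3 worldNormal = normalize(worldNrm);
    return float4(ComputeLighting(worldNormal, worldPos), 1.0);
}
```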
Perform normal map lighting in world space
The second solution addresses the light limitation issues inherent to current normal-map lighting techniques. Again, the solution is to light in world space; that way, we can upload as many lights as we care to. Well, we are still bound by the number of constant registers that can be uploaded to a pixel shader, which is 224 as of the April 2005 version of DirectX 9c, but that should be more than enough to achieve whatever complicated effect an artist can dream up.
Getting the tangent space normal into world space still requires that we build a tangent space matrix using the tangent, binormal, and normal vectors of the vertex. However, instead of using this matrix to convert data into tangent space, we will compute the inverse tangent space matrix (we can use the transpose intrinsic function to do this if our matrices are orthogonal, which they can easily be by imposing a few limitations on the art team) and multiply that with the world space matrix. This will allow us to transform the tangent space normal into a world space normal that we can then use to perform all of our lighting calculations.
Given the tangent space normal N_T from the normal map, where N_T = N * T (N being the object-space normal and T the tangent space matrix), we can derive the world space normal N_W as follows:
N_T * (T^-1 * W) = N * (T * T^-1) * W = N * I * W = N * W = N_W (the normal in world space)
For the shader implementation, please see listing 2.3.
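Listing 2.3 is not included here either; as a hedged sketch of the pixel-shader side (the vertex shader is assumed to have rotated the tangent, binormal and normal into world space and passed them down, and ComputeLighting is the illustrative helper from above):

```hlsl
// Illustrative world-space normal map lighting (a sketch, not the article's listing 2.3).
sampler2D normalMap;

struct PS_IN
{
    float2 uv       : TEXCOORD0;
    float3 worldT   : TEXCOORD1;  // tangent, binormal and normal, already
    float3 worldB   : TEXCOORD2;  // rotated into world space by the vertex
    float3 worldN   : TEXCOORD3;  // shader and interpolated by the hardware
    float3 worldPos : TEXCOORD4;
};

float4 main(PS_IN i) : COLOR0
{
    // Fetch the tangent-space normal and unpack it from [0,1] to [-1,1].
    float3 normalTS = tex2D(normalMap, i.uv).xyz * 2.0 - 1.0;

    // The rows of this matrix are the world-space T, B and N, so multiplying
    // the tangent-space normal by it applies the inverse (transpose) tangent
    // matrix combined with the world rotation, as in the derivation above.
    float3x3 tangentToWorld = float3x3(i.worldT, i.worldB, i.worldN);
    float3 worldNormal = normalize(mul(normalTS, tangentToWorld));

    // From here on, the lighting code is identical to the non-normal-mapped path.
    return float4(ComputeLighting(worldNormal, i.worldPos), 1.0);
}
```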
Note that there is some precision error injected in the resulting world space normal by using this solution. This stems from the fact that we are re-creating a tangent space matrix from interpolated vertex data, as opposed to the actual texel-based tangent space matrix that is computed during normal map generation and used to transform the normal. For most cases, this solution will work perfectly. But if a scenario comes up where a set of vertices share a normal, but not tangent information, shading seams will show up on the geometry (thanks Morten Mikkelsen for pointing this out).
Getting the most out of per-pixel lighting
These new approaches to lighting will be very pixel heavy, and therefore require their own optimizations to make rendering as fast and efficient as possible. This will be the focus of the following few sections.
Break out of the shader as soon as possible
With shader model 3.0, we have full support of dynamic flow control for the pixel shader. This needs to be used to its fullest to save on time spent in the shader. However, in order to make the most of the hardware, some restructuring of our code will be in order.
While pixel shader 3.0 technically supports looping, the loop increment register (aL) is currently only usable with the input registers (v0-v9 as of DirectX 9c April 2005), so all of the loops in the sample shader code actually get unrolled and all lighting calculations are performed. If lighting on the object is constant, and the number of lights being used is minimal, this isn't that big a problem. However, if the light configurations can change from frame to frame, and the light set is large, the shader becomes significantly slower. What's worse, when dynamic looping is used (looping based on a variable passed into the shader), the compiler emits an if-else test for every possible light to decide whether or not its calculation must be performed. Note that this issue is being addressed on the next-generation consoles; whether the same is true for PC graphics hardware is unclear at the time of writing.
In order to properly branch out of such cases, the shader code will unfortunately have to be re-written to contain nested if-else statements that can be branched out of as soon as an end condition is met. To better illustrate this point, here is a simple example.
Let's assume there are 6 point lights, and that any of those lights can be switched on or off at any moment. Every frame, we check which lights are active and build up the arrays of data to pass to the shader. We also build an array of booleans that we can use to opt out of the shader early (as long as there is light data, the corresponding boolean is set to true; once no more data is available, the next boolean is set to false). See listing 3.1.
Listing 3.1: Uploading required information for lighting
We use the boolean constant registers because they take less space overall; each entry is a single bit. Mind you, there are only 16 of them available at this time, so if more light sources are needed, a different solution will be required. But the principle remains the same.
Now that we have our information on the shader, a simple code restructure will ensure that we can quit the pixel shader as soon as we hit the end of our light list. Listing 3.2 has some code representing what the new shader looks like. Note that this solution can support 24 active lights of each type (static nesting has a depth of 24 as of DX 9c April 2005).
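Listing 3.2 is not shown here; the restructured shader might look roughly like the following sketch (it reuses the illustrative ComputePointLight helper and the boolean constants described above; the names are mine, not the article's):

```hlsl
// Illustrative early-out structure using boolean constant registers
// (a sketch of the idea behind listing 3.2, not the actual listing).
#define MAX_LIGHTS 6

bool   bLightActive[MAX_LIGHTS];  // true while there is light data, false afterwards
float3 lightPos[MAX_LIGHTS];
float3 lightColor[MAX_LIGHTS];

float3 AccumulateLights(float3 worldNormal, float3 worldPos)
{
    float3 diffuse = 0;
    if (bLightActive[0])
    {
        diffuse += ComputePointLight(worldNormal, worldPos, lightPos[0], lightColor[0]);
        if (bLightActive[1])
        {
            diffuse += ComputePointLight(worldNormal, worldPos, lightPos[1], lightColor[1]);
            if (bLightActive[2])
            {
                diffuse += ComputePointLight(worldNormal, worldPos, lightPos[2], lightColor[2]);
                // ... nesting continues in the same pattern up to MAX_LIGHTS,
                // bounded by the hardware's static nesting depth of 24.
            }
        }
    }
    return diffuse;
}
```

As soon as an inactive slot is hit, none of the deeper branches are evaluated, so only the active lights cost any shader time.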
Now we have a shader that allows us to opt out when we want, but we are left with another problem: this shader is much bigger and heavier than what we had previously and will take longer to execute. So what else can be done?
Early rejection is key
The last point to mention when talking about optimizing pixel shaders is this: performing no pixel calculations is better than performing a few. Wherever possible, try to avoid overdraw and avoid calculating pixels that cannot be seen. Good solutions for these issues are basic sorting of the scene to be rendered and culling of back-facing surfaces. Some hardware allows rapid z-only passes, where objects can be written to the z-buffer without doing any pixel processing, so that a second pass can be performed later and only the pixels that pass the depth test will be computed. In this way, it is easy to achieve very little overdraw and maximize performance. Considering that next-generation games will require high-definition resolutions of 1280x720, every pixel skipped helps overall shader performance enormously.
Conclusion
Development costs and pricing aren't the only things that are going to rise with the coming of the next generation. Effect complexity and the shaders that drive them will be taking a massive leap, paving the way for "super shaders" that can perform a multitude of effects in a single pass. This article has shown that in such a context, it is possible to have a unified lighting solution available without the need to normal map everything in sight. A lot of research is still being performed on the subject of lighting, and this is simply one of the many approaches that exist.
Special thanks to:
Morten Mikkelsen
Stephen Mulrooney
Don Dansereau
Paul Richardson
End Notes
Mathematics of lighting - Attenuation and spotlight factor
Derivation of the Tangent Space matrix, Jakob Gath and Søren Dreijer