Once upon a time, basing a game on a movie license was a death wish. Things have changed considerably in recent years, however. A number of well-made, successful games have been based on movies, and, on the flip side, several movies have been released that originated as games. With the crossover between movies and games finally starting to show some success, it is time to revisit how Hollywood can actually be helpful to the game industry.
In the past century, motion pictures have developed a visual language that enhances the storytelling experience. Equally important, audiences have grown accustomed to certain conventions used to tell these visual stories. Unfortunately, very little of this knowledge has been translated for use in interactive storytelling.
In Part One of this two-part series, we looked at how to describe a cinematic camera shot in general terms so that it could be automatically converted to camera position and orientation within the game. To conclude, this month’s article brings it all together by presenting a system that can choose the best shots and connect them together. Once finished, these concepts can be joined to form a complete basis for a cinematic experience that improves the interactive storytelling of games by giving players access to the action within a game in ways that make sense to them instinctively.
Major motion pictures are made by hundreds of different people all working together in a huge team effort. To transfer the cinematic experience to the world of games, we can take certain established, key roles from the film industry and translate them into entities in the computer. Objects in object-oriented languages such as C++ can conveniently represent these entities. In this article, we will look at the three primary roles and describe their responsibilities as objects. From this, you can build architectures to coordinate real-time cinematic camera displays. Before going into detail about each role, let’s take a brief look at each in turn.
The first job belongs to the director. In films, the director controls the scene and actors to achieve the desired camera shots that will then be edited later. However, because our director object will have little or no control over the game world, this responsibility shifts to determining where good camera shots are available and how to take advantage of them.
Once these possibilities are collected, they are passed on to the editor who must decide which shots to use. Unlike in motion pictures, however, the editor object must do this in real time as each previous shot comes to an end. The editor is also responsible for choosing how to transition between shots.
Finally, once the shot and transition have been decided upon, it becomes the cinematographer object’s task to transform that information into actual camera position and movement within the game world. With this basic idea of how all the roles and responsibilities fit together, we can move on to a closer look at each individual role.
Through the Viewfinder: The Director
As mentioned previously, the director’s role in the game world is to collect information on available shots and their suitability for inclusion in the final display. This is the one place where human intervention is necessary; once the shots have been suggested, the rest of the system runs without further human input. It is currently impossible to create a system sophisticated enough to determine the priority of events within the game world from a creative standpoint.
Instead, programmers and scripters provide information about the priority and layout of interesting events through a suggestShot method on the director object, hence the term used in this article: event-driven cinematic camera. The editor then uses this information to make the final decision on which shots to include. Following is a breakdown of the information necessary to make these decisions.
The first and most important piece of information is the priority of the shot. The priority represents how interesting a particular shot is compared to other shots available at the time. Thus the value of priority is relative, which means there is no definitive meaning for any particular number. You must therefore be careful to remain consistent within a single game in order to give the priority levels meaning. For example, all other values being equal, a shot with a priority of two is twice as interesting as a shot with a priority of one.
The second piece of information required is the timing of the shot. Timing is the most complex part of the editing process, and the sooner an event can be predicted, the better choices the editor can make. Timing breaks down into four values: start time, estimated length, decay rate, and end condition. The start time is obviously the beginning of the event. The estimated length is a best guess at how long the shot will last. The decay rate determines how quickly the priority decays once the event begins. Finally, the end condition determines when the shot should be terminated. Let’s look at decay rate and end conditions in more detail.
The decay rate is used to determine the actual priority at a given time t using the starting priority p and a constant, k. The constant is provided as part of the decay rate information, since it will differ from shot to shot. The other information for decay rate is the equation to use for determining the actual priority. For maximum flexibility, this should be a function object that takes t, p, k, and the start time, t_s, and returns the priority for that time. Two useful functions that should be predefined as function objects for this parameter are:
These functions should suffice for most circumstances. Notice that the second equation cubes the value rather than squaring it. This is important, because it ensures that the priority becomes negative, and stays negative, once a certain amount of time has passed, whereas squaring would have caused the result to always remain positive. Figure 1 shows the resulting graphs of these functions as a visual aid for understanding how decay rate affects priority.
Figure 1. Decay rate graph, showing how decay rate affects shot priority.
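As a sketch of what such function objects might look like in C++, the following assumes a linear form and a cubic form that match the description above: both reach zero at t_s + 1/k, and the cubic form stays negative afterward. The exact formulas, along with the names DecayFunction, linearDecay, and cubicDecay, are assumptions made for illustration rather than the definitive equations.

```cpp
#include <cassert>
#include <functional>

// Signature described in the text: time t, starting priority p,
// decay constant k, and start time ts; returns the priority at time t.
using DecayFunction =
    std::function<double(double t, double p, double k, double ts)>;

// ASSUMPTION: linear decay that reaches zero at t = ts + 1/k.
inline double linearDecay(double t, double p, double k, double ts) {
    assert(k > 0.0);
    return p * (1.0 - k * (t - ts));
}

// ASSUMPTION: cubic decay. Cubing (1 - k(t - ts)) preserves its sign,
// so the result is negative once t passes ts + 1/k. Squaring the same
// term would never produce a negative priority.
inline double cubicDecay(double t, double p, double k, double ts) {
    assert(k > 0.0);
    const double x = 1.0 - k * (t - ts);
    return p * x * x * x;
}
```

Either function can be stored in a DecayFunction and handed to the director along with the constant k for a given shot.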
The end condition is best specified as a function object that returns one of three values. The first value indicates the shot cannot be terminated yet, the second value indicates the shot can be terminated if another shot is ready, and the third value indicates that the shot must be terminated. The reason for the middle value is that it gives the editor more flexibility in choosing shots by allowing a choice of new shots within a certain time, rather than instantaneously when the shot is terminated.
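The three-valued end condition could be represented as follows. This is a minimal sketch: the enumerator names, the choice to pass the current time to the function object, and the timedEndCondition helper are all hypothetical, chosen only to illustrate the idea.

```cpp
#include <functional>

// The three return values described in the text.
enum class EndCondition {
    CannotTerminate,  // the shot must keep running
    MayTerminate,     // the shot can end if another shot is ready
    MustTerminate     // the shot has to end now
};

// ASSUMPTION: the end condition is queried with the current game time.
using EndConditionFn = std::function<EndCondition(double now)>;

// Hypothetical helper: a purely time-based end condition. The shot is
// locked for minLength seconds, optional until maxLength, then forced out.
inline EndConditionFn timedEndCondition(double start,
                                        double minLength,
                                        double maxLength) {
    return [=](double now) {
        const double elapsed = now - start;
        if (elapsed < minLength) return EndCondition::CannotTerminate;
        if (elapsed < maxLength) return EndCondition::MayTerminate;
        return EndCondition::MustTerminate;
    };
}
```

In practice an end condition would more likely test game state (for example, whether an actor has left the area), but the same three return values apply.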
Next comes the shot information: everything the cinematographer needs to turn the suggestion into a real in-game shot. This includes the primary actor and the secondary actor, if any. The shot size, emphasis, angle, and height may also be necessary. Refer to last month’s article for details on determining these values, as well as the scene information described next.
The scene information consists of the actors within the given scene and the current line of action for that scene. Unfortunately, scene information can change dynamically as actors move around and the cinematographer changes the line of action. Because of this fact, it is best to store the scene as a reference through the primary actor of the shot that is being suggested.
The director’s other responsibilities are to provide the editor with a list of currently available shots at any time and to ensure that this list is up-to-date. Keeping the list up-to-date primarily involves removing shots that are no longer valid. A shot becomes invalid when the priority modified by decay rate, as discussed previously, falls below zero. Once the editor chooses a shot, it is also removed from the list of shots. This brings us to a discussion of how the editor chooses a shot.
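The director's bookkeeping can be sketched as a small class. Only the suggestShot name comes from the text; the ShotSuggestion fields, the id handle, and the availableShots and removeShot methods are assumptions, and the shot, scene, and end-condition data are left out to keep the example short.

```cpp
#include <algorithm>
#include <functional>
#include <utility>
#include <vector>

// Trimmed-down suggestion record (field names are hypothetical).
struct ShotSuggestion {
    int id;                  // handle used to remove a chosen shot
    double priority;         // starting priority p
    double startTime;        // start time of the event
    double estimatedLength;  // best guess at the shot length
    double decayConstant;    // constant k
    std::function<double(double, double, double, double)> decay;  // (t, p, k, ts)

    // Priority only decays once the event has begun.
    double priorityAt(double t) const {
        if (t <= startTime) return priority;
        return decay(t, priority, decayConstant, startTime);
    }
};

class Director {
public:
    void suggestShot(ShotSuggestion s) { shots_.push_back(std::move(s)); }

    // Drops shots whose decayed priority has fallen below zero,
    // then returns the shots that are still valid.
    const std::vector<ShotSuggestion>& availableShots(double now) {
        shots_.erase(std::remove_if(shots_.begin(), shots_.end(),
                         [now](const ShotSuggestion& s) {
                             return s.priorityAt(now) < 0.0;
                         }),
                     shots_.end());
        return shots_;
    }

    // Called once the editor has chosen a shot.
    void removeShot(int id) {
        shots_.erase(std::remove_if(shots_.begin(), shots_.end(),
                         [id](const ShotSuggestion& s) { return s.id == id; }),
                     shots_.end());
    }

private:
    std::vector<ShotSuggestion> shots_;
};
```

Pruning lazily, whenever the editor asks for the list, keeps the director from needing its own per-frame update.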
Slice and Dice: The Editor
The editor is responsible for choosing the next shot that will be shown as well as any transitions between shots. First, let’s look at the process of choosing the next shot. The majority of the information needed is provided with the shot suggestions from the director, but there are parameters that can be used to give each editor its own style. The two parameters involved in shot decisions are the desired shot length, l_shot, and the desired scene length, l_scene. By setting these for different editors, the shots chosen will vary for the same set of circumstances. For example, one editor could prefer short scenes filled with one or two long shots by setting the shot time and the scene time to be relatively close values. On the other hand, another editor could prefer longer scenes filled with short shots. This provides a number of options when choosing an editor for a particular situation.
The time for choosing the next shot is determined by a change in the return value of the end condition for the current shot. Once the current shot indicates that it can be terminated, the editor must obtain the list of currently available shots from the director. From this list, the editor object then filters out any shots whose start time is too far in the future. If the end condition requires immediate termination, this excludes every shot suggestion whose start time has not yet arrived, leaving only shots that start now or have already started. Otherwise, all shots whose start time is no more than l_shot beyond the current time are considered.
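The filtering step can be sketched as a single function. The TimedShot type and the function and parameter names are hypothetical; the cutoff rule is the one described above.

```cpp
#include <vector>

// Only the field needed for filtering is shown (names are hypothetical).
struct TimedShot {
    int id;
    double startTime;
};

// Keeps the shots that start early enough to be considered.
// mustTerminateNow is true when the current shot's end condition
// demands immediate termination.
inline std::vector<TimedShot> filterByStartTime(
        const std::vector<TimedShot>& shots,
        double now,
        double desiredShotLength,
        bool mustTerminateNow) {
    // Forced termination: only shots that have already started qualify.
    // Otherwise: look ahead by up to the desired shot length.
    const double latestStart =
        mustTerminateNow ? now : now + desiredShotLength;

    std::vector<TimedShot> kept;
    for (const TimedShot& s : shots) {
        if (s.startTime <= latestStart) kept.push_back(s);
    }
    return kept;
}
```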
To choose the shot from this list, we must sort them based on a value that represents the quality of each shot suggestion and then take the shot with the highest value. Before we can compute this value, we need to introduce a few other values that will be used in its calculation. First, we consider the desired shot length versus the estimated shot length, l_estimated:
Then we look to see if the actors have any relation to those in the last shot:
Next, we check to see if the new scene matches the old scene. For this the editor must also keep track of the time spent in the current scene, t_scene:
Finally, the priority is modified by the decay rate discussed earlier if the shot event has already commenced:
Once we have all this information, we can compute the quality value of each shot on the list:
Notice that the values c_actor and c_scene allow us to maintain consistency for our shots. This is a very important property of good film directing and editing and should not be overlooked in interactive cinematography, even though it is more difficult to maintain.
You may also have noticed that, when calculating p_w(t), t can be earlier than t_s, so under some circumstances it is possible to choose a shot that has not started yet. In this case, we hold on to the shot and wait for one of two events: either the shot’s start time arrives or the end condition of the current shot forces us to terminate. When either event occurs, we must once again test to see which shot is best, in case a better shot has come along or we are being forced to move on before the shot we would like to display can start.
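The selection step itself, ranking the candidates by quality and taking the highest, can be sketched independently of how the quality value is calculated. Here the quality calculation is passed in as a function object, so the sketch makes no assumption about its formula; the RankedShot type and chooseBestShot name are hypothetical.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Placeholder for a filtered shot suggestion (hypothetical type).
struct RankedShot {
    int id;
};

// Returns the candidate with the highest quality value, or nullptr if
// the list is empty. Taking the maximum gives the same winner as sorting
// the whole list and picking the top entry.
inline const RankedShot* chooseBestShot(
        const std::vector<RankedShot>& shots,
        const std::function<double(const RankedShot&)>& quality) {
    if (shots.empty()) return nullptr;
    return &*std::max_element(
        shots.begin(), shots.end(),
        [&quality](const RankedShot& a, const RankedShot& b) {
            return quality(a) < quality(b);
        });
}
```

Because the quality function is supplied by the caller, each editor can plug in its own desired shot and scene lengths without changing the selection code.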
Now that an ordering exists that allows us to choose the next shot, the only remaining choice necessary is the transition from the current shot to the new shot. If we are transitioning between different scenes, the choice is easy: a cut or fade should be used. However, if the transition is between two shots in the same scene, the logic becomes slightly more complex. Within a scene it is important to maintain the line of action; in other words, to keep the camera on one side of a plane defined for the scene so as not to confuse the viewer’s perception of scene orientation.
Let’s consider the various permutations that can occur between shots and what type of transition should be used. For now, we will break them into fading (think of cutting as an instantaneous fade) and camera movement. We will go into more detail on moving the camera later. First, if the actors remain the same between the shots, then we can preserve the line of action and use a fade. Likewise, even if the actors change but the new line of action lies on the same side of the line of action as the previous camera position, then a fade can still be used.
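One simple way to test whether two camera positions sit on the same side of a line of action is a signed-area check. The sketch below assumes the line of action runs through two actors and is evaluated in the ground plane, which is a simplification of the plane described above; the Vec2 type and function names are hypothetical.

```cpp
// 2D point in the ground plane (hypothetical helper type).
struct Vec2 {
    double x;
    double y;
};

// Signed area test: positive if p is to the left of the directed line
// from a to b, negative if to the right, zero if p is on the line.
inline double sideOfLine(Vec2 a, Vec2 b, Vec2 p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// True when both camera positions lie strictly on the same side of the
// line of action running through the two actors.
inline bool sameSideOfLineOfAction(Vec2 actorA, Vec2 actorB,
                                   Vec2 cameraFrom, Vec2 cameraTo) {
    return sideOfLine(actorA, actorB, cameraFrom) *
           sideOfLine(actorA, actorB, cameraTo) > 0.0;
}
```

When the test passes, a fade keeps the viewer oriented; when it fails, the editor falls through to the camera-move logic.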
However, if the two lines of action differ significantly, then a camera move needs to be performed. The camera move should allow the line of action to change without confusing the viewer. To get a rough approximation of the distance the camera must travel, compare the distances between the current and new camera positions and the current and new focal points. Now compute how fast the camera must move to travel that distance in the time it would take for the new shot to become uninteresting:
Where D_c is the vector between camera positions, D_f is the vector between focal points, and p(t) is the priority decay formula for the shot. For the decay formulas given earlier, t would be t_start + 1/k.
If the camera move cannot be made at a reasonable speed, then a new shot must be chosen, unless the actors from the last shot would not be visible in the pending shot. Otherwise, a new shot should be chosen with preference for close-ups that include only one actor, thus making the next transition easier. We can now move on to realizing the shot and transition.
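The speed check can be approximated as follows. This sketch is an assumption about how the pieces combine, not the definitive equation: it takes the larger of the camera displacement and the focal-point displacement as the rough distance, and divides by the time remaining until the new shot's priority reaches zero, using t_start + 1/k as that time. All type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

struct Vec3 {
    double x, y, z;
};

inline Vec3 operator-(Vec3 a, Vec3 b) {
    return Vec3{a.x - b.x, a.y - b.y, a.z - b.z};
}

inline double length(Vec3 v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// ASSUMPTION: distance is approximated by the larger of |D_c| and |D_f|,
// and the move must finish before the priority of the new shot hits zero.
inline double requiredCameraSpeed(Vec3 cameraFrom, Vec3 cameraTo,
                                  Vec3 focusFrom, Vec3 focusTo,
                                  double now, double shotStart, double k) {
    assert(k > 0.0);
    const double distance = std::max(length(cameraTo - cameraFrom),
                                     length(focusTo - focusFrom));
    const double timeLeft = (shotStart + 1.0 / k) - now;
    if (timeLeft <= 0.0) {
        // The shot is already uninteresting; no finite speed is enough.
        return std::numeric_limits<double>::infinity();
    }
    return distance / timeLeft;
}
```

The result can then be compared against whatever maximum speed feels reasonable for the game's scale.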
Lights, Camera, Action: The Cinematographer
Last month, we covered the math necessary to turn a description of a shot into actual camera position and orientation. This month, we will build on that and flesh out the role of the cinematographer by covering the handling of transitions.
The simplest transition is the cut, where we only need to change the camera position and orientation to a new position and orientation. Only slightly more complex is the fade, which provides a two-dimensional visual effect between two camera views. When fading, it is important to decide whether to fade between two static images or allow the world to continue to simulate while the fade occurs. Allowing continued simulation implies rendering two scenes per frame but eliminates the need for pauses in gameplay. If you are able to handle the extra rendering, interesting fade patterns can be achieved by using complementary masks when rendering each scene. Depending on the hardware available for rendering, you may only be able to do black and white masks, or you could go all the way to alpha-value masks.
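The complementary-mask idea can be illustrated on the CPU with one intensity value per pixel. This is only a sketch of the arithmetic; a real implementation would run on the graphics hardware, and the maskedFade name and buffer layout are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Blends two rendered views using complementary masks. A mask value of
// 0 shows only the old view, 1 shows only the new view, and values in
// between mix the two (the alpha-value case). Black and white masks are
// the special case where every mask value is exactly 0 or 1.
inline std::vector<float> maskedFade(const std::vector<float>& oldView,
                                     const std::vector<float>& newView,
                                     const std::vector<float>& mask) {
    assert(oldView.size() == newView.size());
    assert(oldView.size() == mask.size());

    std::vector<float> result(oldView.size());
    for (std::size_t i = 0; i < result.size(); ++i) {
        // The old view is weighted by the complement of the mask.
        result[i] = (1.0f - mask[i]) * oldView[i] + mask[i] * newView[i];
    }
    return result;
}
```

Animating the mask over the duration of the fade, for example by sweeping a threshold across a pattern image, produces wipes and other patterned transitions.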
The other group of transitions involves moving the camera. The three transitions we will consider are pan, zoom, and crane. The decision of which move to make depends on the camera and focal positions for the two shots. Figure 2 shows the various situations that lead to the choice of a particular move. The pan is used if the camera is in approximately the same location for both shots and only the focal point moves. Though this happens rarely in an interactive environment, when it does happen the old camera position can be kept and only the orientation needs to be animated to the new orientation. Similarly, the conditions for zooming are fairly uncommon, as both the camera positions and focal points must lie close to the same line, but when they do occur the camera field-of-view can be used to allow a much more interesting transition than a simple camera move.
Figure 2. Shot transition criteria, where r_e is the radius of acceptable error.
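The choice among the three moves can be sketched as a classification function. The geometric tests here are one interpretation of the criteria: a pan when the two camera positions are within the error radius of each other, a zoom when the new camera position and new focal point both fall within the error radius of the line through the old camera position and old focal point, and a crane otherwise. The names and the exact tests are assumptions.

```cpp
#include <cmath>

struct Vec3 {
    double x, y, z;
};

inline Vec3 operator-(Vec3 a, Vec3 b) {
    return Vec3{a.x - b.x, a.y - b.y, a.z - b.z};
}

inline Vec3 cross(Vec3 a, Vec3 b) {
    return Vec3{a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x};
}

inline double length(Vec3 v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Distance from point p to the infinite line through a and b.
inline double distanceToLine(Vec3 p, Vec3 a, Vec3 b) {
    const Vec3 ab = b - a;
    const double abLength = length(ab);
    if (abLength == 0.0) return length(p - a);  // degenerate line
    return length(cross(ab, p - a)) / abLength;
}

enum class CameraMove { Pan, Zoom, Crane };

// errorRadius plays the role of the radius of acceptable error.
inline CameraMove chooseCameraMove(Vec3 cameraOld, Vec3 focusOld,
                                   Vec3 cameraNew, Vec3 focusNew,
                                   double errorRadius) {
    if (length(cameraNew - cameraOld) <= errorRadius) {
        return CameraMove::Pan;  // camera stays put, only the focus moves
    }
    if (distanceToLine(cameraNew, cameraOld, focusOld) <= errorRadius &&
        distanceToLine(focusNew, cameraOld, focusOld) <= errorRadius) {
        return CameraMove::Zoom;  // everything lies along one line
    }
    return CameraMove::Crane;  // general case: move the camera
}
```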
Finally, we come to the most complex transition, the crane. The best method for creating a crane move is often by borrowing the services of the AI’s path-planning algorithm in order to avoid moving the camera through objects. It is best if the path planning also handles orientation, as this will lead to better results than interpolating between the focal points.
Unfortunately, getting crane shots to look their best is a complex process for which this is only a starting point. If you do not have the time to invest in making them work, you may wish to leave them out altogether.
Beyond the Basics
You now have enough information to create your own basic cinematic system to include in your game. There is plenty of room to go beyond this basic system. Research on some of these areas has already been conducted in academic circles. For instance, events that involve conversations between characters could be specified as a single suggestion rather than manually suggesting each individual shot during the discourse. “The Virtual Cinematographer” and “Real-time Cinematic Camera Control for Interactive Narratives” (see For More Information) describe how director styles can be created to specify camera shots automatically for these situations. This reduces human involvement, which is always important as it allows other features to be added to the game.
Another important aspect of cinematography that is only now becoming possible with the power of newer graphics hardware is depth-of-field. This is often used as a mechanism to draw attention to various elements in a scene of a film. As rendering of depth-of-field becomes more common, it will be important to develop controls for it that are based on the principles learned from cinematography. It is even possible to extend the concept of depth-of-field in ways that would be difficult in real-world filmmaking. “Semantic Depth of Field” in For More Information talks about selective application of depth-of-field effects on important elements of an image.
As you can see, there is a wealth of information out there and plenty of room for experimentation and new ideas. As games continue to grow in popularity, they must meet the demands of the more general audience that is used to the conventions of films. There is much to do in order to reach this goal and continue to expand the scope of game development. Continued innovation and experimentation in this area will bring out greater variety of expression on the part of game developers, and richer, more compelling game experiences for players.
For More Information
Amerson, Daniel, and Shaun Kime. “Real-time Cinematic Camera Control for Interactive Narratives.” American Association for Artificial Intelligence, 2000. pp. 1–4.
Arijon, Daniel. Grammar of the Film Language. Los Angeles: Silman-James Press, 1976.
He, Li-wei, Michael F. Cohen, and David H. Salesin. “The Virtual Cinematographer: A Paradigm for Automatic Real-Time Camera Control and Directing.” Proceedings of SIGGRAPH 1996. pp. 217–224.
Katz, Steven D. Film Directing Shot by Shot. Studio City, Calif.: Michael Wiese Productions, 1991.
Kosara, Robert, Silvia Miksch, and Helwig Hauser. “Semantic Depth of Field.” Proceedings of the IEEE Symposium on Information Visualization 2001.
Lander, Jeff. “Lights… Camera… Let’s Have Some Action Already!” Graphic Content, Game Developer vol. 7, no. 4 (April 2000): pp. 15–20.