
More AI in Less Processor Time: 'Egocentric' AI

Modern games that are both complex and graphics-intensive will increasingly make computational demands on hardware and programmers. Fortunately, Ian Wright and James Marshall are at hand with techniques to control and manage real-time AI execution, techniques that open up the possibility of future hardware acceleration of AI.

Ian Wright and James Marshall

June 19, 2000


The design brief for the new game's AI has just been handed to you, and to call it optimistic would be an understatement. You are charged with developing a living, breathing city that runs in real time, populated by thousands of pedestrians, hundreds of cars, and dozens of non-player characters. The 'incidental' pedestrians and traffic need to react convincingly to each other and to your actions, while the NPCs absolutely, positively must act in a believable manner when you encounter them. It's going to be computationally expensive, but you've only been given 20% of the processor time each frame, and if you exceed that and the game drops frames, you've failed.

Modern games will increasingly make such demands on hardware and programmers. Fortunately help is at hand with techniques to control and manage real-time AI execution, techniques that open up the possibility of future hardware acceleration of AI.

Games Should Be Fun

Games should be fun. This requirement has many consequences. One important consequence is that games that allow player input at any moment ("arcade-style" games) should run in real-time, presenting events that occur sufficiently fast to challenge the player's reactions. Lower frame-rates look bad, reduce the opportunity for interaction, increase player frustration, and do not reflect the speed of events in the real world. With this firmly in mind, we set out to design a framework for the general execution of AI code.

The latter stages of a game project involve optimising parts of the game code to reduce processing time. This includes AI code, which, depending on the type of game, can consume more or less of the available CPU time. Given this, an important requirement for general AI execution is that (a) it conforms to the timing constraint of the overall game frame rate. A consequence of (a) is that the AI never exceeds a maximum per-frame processing time.

AI requires the execution of arbitrarily complex and heterogeneous pieces of code, often grouped together as behavioural "rules" or "behavioursets" for various game objects or agents, such as the AI code for a trapdoor, obstacle, spring, or the code for an adversary, racing vehicle or character. Therefore, a further requirement for general AI execution is that (b) it makes no assumptions about the exact nature of the AI code, including assumptions about how long the code will take to execute.

Rendering code normally has to execute every frame in order to construct the visual scene. The situation is different for AI code. Consider a soccer player, which may need to check for passing and shooting opportunities every frame, but need only check its position against the team's formation every other frame, or only in a dead-ball situation. AI code involves a wide range of execution frequencies compared to non-AI game code. If all AI code is fully executed every frame when this is not required, the resulting code is inefficient. Also, some games require different execution frequencies for objects and agents, in addition to controlling the execution frequencies of their internal processes. For example, a very slow-moving tortoise need not be processed every frame, whereas the hare may need to be. Hence, a further requirement for general AI execution is that (c) it allows different execution frequencies to be specified both for agents and their constitutive internal processes.

Finally, we realised that some AI processes can be extensively time-sliced across many frames, particularly if the results of the process are not immediately required. For example, if a strategy game agent needs to plan a route through a terrain, then the planning can potentially take place over many frames before the agent actually begins to traverse the deduced route. Time slicing allows computationally expensive processes to be 'smeared' across many frames, thereby reducing the per-frame CPU hit. Therefore, a final requirement for general AI execution is that (d) it allows AI processes to be dynamically suspended and reactivated.
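To make this concrete, a time-sliceable process can be structured as an object that performs a bounded chunk of work each time it is called and reports whether it has finished. The C++ sketch below shows one possible shape for such a route planner; the RoutePlanner class and its methods are illustrative, not part of any actual interface.

// A hypothetical sliceable process: a route planner that expands at most
// maxNodes search nodes per call, so its work can be smeared across frames.
#include <queue>

class RoutePlanner {
public:
    enum Status { InProgress, Done };

    Status step(int maxNodes) {
        int expanded = 0;
        while (!open.empty() && expanded < maxNodes) {
            open.pop();   // expand one search node (successor generation elided)
            ++expanded;
        }
        return open.empty() ? Done : InProgress;
    }

private:
    std::priority_queue<int> open;   // search frontier (simplified to ints)
};

The Process Manager would call step() once per scheduled frame until it returns Done, at which point the agent can begin traversing the deduced route.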

There are no general methods for supporting different execution frequencies of parts of AI code and time-slicing non-urgent AI processes. Where these techniques are employed, it is in a project-specific, ad hoc manner. There is no 'AI operating system' that allows programmers to control these factors. This represents an important missed opportunity for the development of more complex AI in games. If all AI code were executed through a common AI operating system or engine, with mechanisms for specifying execution frequencies, upper bounds on CPU time, time-slicing, and suspension and reactivation of processes, then it would be possible to get more AI for the same CPU power.

To recap, here are four main requirements for general-purpose AI execution:

 

  a. Conformity to the timing constraint of the overall game frame rate despite variable AI load;

  b. No assumptions about the exact nature of AI code;

  c. Allow different AI processes to have different execution frequencies, both for agents and their constitutive internal processes;

  d. Allow AI processes to be dynamically suspended and reactivated.

By now you may have realised that (a) asks the impossible: AI algorithms that take the same amount of time even when asked to do more work. However, games must entertain the player, not implement a perfect simulation. In the next section we'll look at how requirement (a) can be partially satisfied.

Believability Vs Accuracy

An arcade-style game is somewhat like the real world, consisting of both active and passive agents and events that unfold over time. But the game need not process every object, agent and event in the virtual world in order to present a believable, entertaining experience. For example, if a truck is within the player's field of view when it plants a mine, then the game necessarily needs to process the truck movement and the mine drop, and the rendering code necessarily needs to draw this event to the screen. However, if the truck is 'off-screen' the rendering code need not be run, and the AI code controlling the truck could simply assert the existence of a mine on the road at a certain time, rather than processing the fine-grained movement of the truck. Virtual game worlds need to present a believable world to the player, not necessarily an accurate simulation of the real world. Events not 'interactively close' to the human player need not be fully processed.

Therefore, requirement (a) can be satisfied if some AI processes need only be "believable" rather than "accurate". These kinds of processes can be time-sliced over many frames, executed at a lower frequency, replaced with computationally less expensive "default" behaviours, or simply postponed. Furthermore, what may need to be "accurate" at one time may need to be only "believable" at another, depending on the current activities of the human player. We call the idea of prioritising the update of the parts of the game world currently most relevant to the player "egocentric processing". Our Process Manager implements this idea.

The Process Manager

The Process Manager (PM) meets the requirements (a) to (d) identified above. It is implemented in C++ and runs on Sony Computer Entertainment's PlayStation 2.

The PM is a simple operating system for game AI code. The PM allows users to specify an upper time limit for AI processing. If the limit is exceeded the PM attempts to alter the timing of processing to meet the upper limit. The PM provides a common execution framework for active and passive objects in games. When there is sufficient time in the current frame to complete all scheduled AI processing, the PM acts as a normal scheduler or "task system". But if there isn't sufficient time the PM starts to behave very differently from a normal scheduler and begins to employ the "egocentric" principle. Under high loads it maintains a constant AI time by postponing or avoiding the less "important" computational work, where "importance" can be defined by the user (e.g., more "important" if nearer the main camera).

This functionality is provided via several mechanisms:

 

  1. Specification of a global per-frame time limit for AI processing, the "AI time". (Requirement a)

  2. Allocation and control of available execution time between agents and AI behaviours according to user specified priorities, reducing the need for manual optimisation of AI code. (Requirement a)

  3. "Dynamic activation delay" of newly created agents to evenly distribute AI processing over frames. (Requirement a)

  4. Simple method invocation of the list of AI processes as defined by the user. (Requirement b)

  5. Control of process execution frequency and frame-by-frame interleaving of agents and their constituent behaviours. (Requirement c)

  6. Control of execution order of agents and AI behaviours. (Requirement c)

  7. Time-slicing of AI processes. (Requirement d)

Together these mechanisms provide users with a high level of control over the time profile of their AI code, and also an ability to maintain a constant frame rate despite a variable AI load.

The PM assumes that each agent consists of the following, possibly empty, sets: perception sets, behavioursets, and action sets. Each perception set contains methods to translate game state into an internal agent representation, a behaviourset contains methods that create and modify the agent's internal representations, and an action set translates internal representations to changes to game state. The decomposition of an agent into sensing, acting and thinking parts does not impose any restrictions on the type of AI code. For example, a road cone agent in a racing car game may have an empty perception set, a single behaviourset that performs physical calculations based on the cone's position and velocity, and a single action set that updates the velocity and position based on those calculations. The racing car itself, however, may be more complex. For example, it could have a perception set that updates an internal representation of the car's position, and the position of competitors; a behaviourset that contains methods to calculate steering, brake and acceleration inputs based on current perceptions; and an action set that translates the inputs to update the car's velocity and position. Behaviours within behavioursets, perceptions within perception sets and actions within action sets are the indivisible code "atoms" that the PM executes.
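A minimal sketch of this decomposition might look as follows in C++. The article does not give the PM's actual interfaces, so all type and member names here are illustrative assumptions.

// Illustrative types only: the PM's real classes are not shown here.
#include <vector>

struct GameState {};     // stands in for shared game state
struct AgentMemory {};   // stands in for the agent's internal representation

// An indivisible code "atom" executed by the PM.
class Atom {
public:
    virtual ~Atom() {}
    virtual void run(GameState& world, AgentMemory& memory) = 0;
};

class Agent {
public:
    std::vector<Atom*> perceptions;    // game state -> internal representation
    std::vector<Atom*> behavioursets;  // create and modify internal representations
    std::vector<Atom*> actions;        // internal representation -> game state

    // Unscheduled execution: run everything. The PM instead selects and
    // times individual atoms according to periods and priorities.
    void executeAll(GameState& world) {
        for (Atom* p : perceptions)   p->run(world, memory);
        for (Atom* b : behavioursets) b->run(world, memory);
        for (Atom* a : actions)       a->run(world, memory);
    }

private:
    AgentMemory memory;
};

The road cone agent would simply leave its perceptions vector empty, while the racing car would populate all three sets.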

The Time Allocation Algorithm

The Process Manager is called every game cycle with a single parameter that specifies the AI time. The PM holds a user-defined AgentList, which is the list of currently active agents. When invoked the PM executes all the currently active agents and attempts to ensure that this processing does not exceed AI time.

The PM allocates AI time between active agents, according to the user-specified priority of each agent. In turn, each agent allocates its share of AI time between its constituent behavioursets that are active in the current frame, also according to user-specified priorities. The time allocation process is shown in Figure 1.

wright_01.gif

Figure 1. Allocation of available processor time

The PM's functionality is distributed between the PM class and the agent class; a common base class provides the necessary time-allocation functionality at each level.

Each agent and behaviourset is allocated a portion of AI time as follows:

$$t_i = t \cdot \frac{p_i}{\sum_{j=1}^{n} p_j}$$

where

t_i = CPU time allocated to agent/behaviourset i
t = AI time
n = number of agents/behavioursets to be executed
p_i = priority of agent/behaviourset i (in the range 1-100)
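For illustration, this allocation rule is only a few lines of code over a list of schedulable items. The Schedulable record below is a hypothetical stand-in for the PM's real agent and behaviourset classes.

// Priority-proportional allocation: t_i = t * p_i / sum of priorities.
#include <vector>

struct Schedulable {
    int priority;        // user-specified, in the range 1-100
    double allocated;    // output: share of the AI time, in milliseconds
};

void allocateTime(std::vector<Schedulable>& items, double aiTimeMs) {
    double prioritySum = 0.0;
    for (const Schedulable& s : items) prioritySum += s.priority;
    if (prioritySum <= 0.0) return;
    for (Schedulable& s : items)
        s.allocated = aiTimeMs * s.priority / prioritySum;
}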

 

The pseudo-code in Listing 1 illustrates the execution of the PM.

Listing 1. Process Manager Execution

// scheduling
determine active agents this frame
calculate time allocation for each active agent

// execution
for each active agent on list

    // execute perception set
    determine active perception methods this frame
    execute active perception methods

    // scheduling
    determine active behavioursets this frame
    calculate time allocation for each active behaviourset

    // execution
    for each active behaviourset on list
        if behaviourset was terminated early on last execution
            if behaviourset is suspendable
                restart behaviourset execution
            else if behaviourset is sliceable
                continue behaviourset execution
            end if
        else
            run behaviourset
        end if

        check time
        if no time remaining exit behaviourset execution
    end for

    // execute action set
    determine active action methods this frame
    execute active action methods

    check time
    if no time remaining exit PM
end for

Specifying a Game Agent

In order for the PM to control AI execution, certain execution parameters are required. The execution parameters required for agents and behavioursets are described in Table 1.

Table 1. Execution parameters

Execution priority — the process's share of AI time relative to its siblings, in the range 1-100.

Execution period — the number of frames between scheduled executions of the process.

Execution method — how the PM may run the process: to completion, suspendable (restarted if cut short) or sliceable (resumed if cut short).

Activation delay — the number of frames to wait after creation before the process first executes.

 

Process Peaking

Process peaking occurs when AI processes are scheduled to execute on the same frame at regular intervals. For example, consider three behavioursets initialised at the same time, with execution periods of 4, 6 and 8. In consequence, every 8 and 12 frames 2 behavioursets require execution, while every 24 frames 3 behavioursets require execution. When the PM is working with large numbers of behavioursets with a diverse range of execution periods, process peaking will undermine the effort to maintain a constant frame rate.

There are two main methods for minimising the number of processes executing on individual frames:

(a) Specifying execution periods as prime numbers can reduce process peaking. Compare the previous example with the following: now the three behavioursets have execution periods of 3, 5 and 7 (all prime numbers). This time there are 3 cases where there are 2 behavioursets requiring execution occurring at intervals of 15, 21 and 35 frames, while every 105 frames there are 3 behavioursets requiring execution. Table 2 summarises the number of occurrences of process peaking under the different execution periods during a 105 frame period.

Table 2. Process peaking under different execution periods (occurrences during a 105 frame period)

                                     Periods 4, 6, 8    Periods 3, 5, 7 (prime)
Frames where 2 behavioursets peak:         13                     12
Frames where 3 behavioursets peak:          4                      1

Table 2 shows that when using prime execution periods process peaking is reduced, despite the fact that each behaviourset now executes more frequently. The instances of process peaking with 2 behavioursets reduce marginally, and the occurrences of 3 behaviourset process peaks are reduced by a factor of 4. The current PM does not incorporate prime periods in its design; instead, users are simply informed of this useful fact.
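The counts in Table 2 can be verified with a few lines of code. The sketch below (illustrative, not part of the PM) counts the frames in a window on which exactly k processes, all activated on frame 0, coincide.

// Count frames in [1, window] on which exactly k of the given periodic
// processes execute simultaneously (all activated on frame 0).
#include <cstdio>
#include <vector>

int peaksOfSize(const std::vector<int>& periods, int window, int k) {
    int count = 0;
    for (int frame = 1; frame <= window; ++frame) {
        int executing = 0;
        for (int p : periods)
            if (frame % p == 0) ++executing;
        if (executing == k) ++count;
    }
    return count;
}

int main() {
    std::vector<int> composite = {4, 6, 8}, prime = {3, 5, 7};
    std::printf("4,6,8 : %d double peaks, %d triple peaks\n",
                peaksOfSize(composite, 105, 2), peaksOfSize(composite, 105, 3));
    std::printf("3,5,7 : %d double peaks, %d triple peaks\n",
                peaksOfSize(prime, 105, 2), peaksOfSize(prime, 105, 3));
    return 0;
}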

(b) Process peaking can be reduced by delayed activation. Agents may have the same execution period without ever executing in the same frame. This is possible through the introduction of a mechanism for delaying activation of agents. Take the example of two newly created agents, each requiring execution every two frames. By delaying the activation of one of the agents by one frame after agent creation it is possible to interleave the execution of the agents so that only one executes each frame, thus minimising process peaking. The same principle can be applied to behavioursets within agents. The PM user can specify activation delays for behavioursets at compile-time, while the PM itself can handle activation delays for agents at run-time.

Two possible cases exist for the interaction of execution period and activation delay:

· Some or all of the agents/behavioursets with different activation delays are put permanently out of phase, and will never all execute simultaneously.
· The processes first execute simultaneously earlier than they would have done without activation delays, but subsequently execute simultaneously at the normal interval predicted by the lowest common multiple of their execution periods.

The important point is that specifying activation delays cannot increase the interval at which process peaks recur (compared to simultaneous activation); it can only bring peaks forward or, in the best case, eliminate them entirely.

The PM reasons about the density of processes executing within the game, and decides on the best way to fit new agents and their behavioursets into this environment. The PM uses the following formula to calculate the number of processes executing in a future frame at time step t.

$$P(t) = \sum_{i=1}^{n} \big[\, t \ge a_i \ \text{and}\ (t - a_i) \bmod p_i = 0 \,\big]$$

where the bracket [·] evaluates to 1 when its condition holds and 0 otherwise, and

P(t) = number of processes executing at time step t
n = number of agents
t = time step under examination
a_i = activation time of agent i
p_i = execution period of agent i

 

The formula can be modified to consider behavioursets within agents with different activation delays and execution periods. It is used in the pseudo-code below to determine the optimal activation delay for a newly created agent. Here d is the maximum activation delay possible for a newly created agent, while l is the lookahead distance over which the number of process peaks is examined. A reasonable value for d could be 3, while l could be set to the largest behaviourset execution period within the agent, plus its associated activation delay. These settings allow newly created agents to have their activation delayed by up to 3 frames, based on an assessment of future process peak levels including at least one execution of all the agent's behavioursets. For a more comprehensive search of future process peaks, the lookahead distance l can be specified as the lowest common multiple of all the agent's behaviourset execution periods. The lowest common multiple is the maximum length of repeated process peaking patterns; therefore, specifying this as the lookahead value ensures that the algorithm considers all potential future process peaks. However, this approach is computationally expensive, as the lookahead distance may be large, particularly if execution periods are specified as primes.

for each candidate activation delay from 0 to d
    calculate magnitude of process peaks over the next l time steps
end for

choose the activation delay resulting in the smallest maximum process peak
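Put together, the formula and the loop above amount to a small search. The following C++ sketch is one possible rendering of it; the Proc record and function names are our illustrative assumptions, not the PM's interface.

// Dynamic activation delay: try each delay in [0, maxDelay], score it by the
// worst process peak over the lookahead window, and keep the best.
#include <vector>

struct Proc { int activation; int period; };

// Number of processes executing at time step t (the formula above).
int processesAt(const std::vector<Proc>& agents, int t) {
    int count = 0;
    for (const Proc& a : agents)
        if (t >= a.activation && (t - a.activation) % a.period == 0)
            ++count;
    return count;
}

int chooseActivationDelay(std::vector<Proc> agents, int newPeriod,
                          int now, int maxDelay, int lookahead) {
    int bestDelay = 0, bestPeak = -1;
    for (int d = 0; d <= maxDelay; ++d) {
        agents.push_back(Proc{now + d, newPeriod});  // trial activation time
        int worst = 0;
        for (int t = now; t < now + lookahead; ++t) {
            int p = processesAt(agents, t);
            if (p > worst) worst = p;
        }
        agents.pop_back();
        if (bestPeak < 0 || worst < bestPeak) { bestPeak = worst; bestDelay = d; }
    }
    return bestDelay;
}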

The complexity of the algorithm is O(n·pn), where n is the number of agents under consideration and pn is the maximum period among all the new agent's behavioursets plus that behaviourset's associated activation delay. This is a pseudo-polynomial time algorithm.

The parameters of the lookahead algorithm trade minimising process peaking against the expense of the search. Therefore, the algorithm can be used either with a small lookahead at run-time for dynamic activation delay, or with full lookahead as an offline optimisation tool for sets of agents and behavioursets. With dynamic activation delay the PM will avoid unnecessary process peaking when new agents are added to the AgentList at run-time.

"Level of Detail" AI

Sometimes it is desirable to modify the priorities of agents while a game is running in order to control the allocation of AI time to different parts of the game AI. For example, in a crowded city environment, the priorities of pedestrians further away from the camera could be reduced. Such functionality is necessary to implement the egocentric principle. Reducing the priority of agents far away from the player is analogous to the graphical technique of using different 'level of detail' models for rendering according to distance from the camera.

The PM allows run-time modification of both agent and behaviourset priorities. Similarly, it may be necessary to modify the execution periods of agents, and potentially of behavioursets. Modifying execution periods may unbalance the AI computational load, resulting in process peaks. Nevertheless, for flexibility, facilities are provided to modify execution periods for both agents and behavioursets.
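As a simple illustration of the idea, an agent's priority could be scaled down with distance from the camera each frame. The function below is hypothetical and its falloff constant is arbitrary.

// "Level of detail" AI: halve the priority for every 50 metres of distance
// from the camera (arbitrary falloff), clamped to the PM's 1-100 range.
#include <algorithm>
#include <cmath>

int lodPriority(float distanceToCamera, int basePriority) {
    float scaled = basePriority * std::pow(0.5f, distanceToCamera / 50.0f);
    return std::max(1, std::min(100, static_cast<int>(scaled)));
}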

AI Granularity

The PM is a non-pre-emptive multi-tasking scheduler: it only regains control of the processing thread when the code "atom" (e.g., behaviour in a behaviourset) terminates. As the PM has to wait until AI code returns control, it is inevitable that overruns will occur. For example, a behaviour might be allocated 100 clock cycles, the PM invokes the behaviour, but when control is returned 200 clock cycles have elapsed. In consequence the PM cannot guarantee to terminate within the specified AI time: it can only approximate the specified upper bound. This can lead to some agents or behavioursets not having time to execute. To alleviate this, we allow execution of agent/behaviourset lists to resume after the last executed agent/behaviourset, as well as allowing time allocation to agents/behavioursets to be continually recalculated based on the remaining available AI time in a frame.
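The sketch below illustrates these two mitigations: the clock is checked only between atoms, and a resume index remembers where execution stopped so the next frame carries on from there. The timing source and agent interface are placeholders, not the PM's actual code.

// Non-pre-emptive loop with a soft budget: control returns only when an
// agent's atom finishes, so the AI time is an approximate upper bound.
#include <chrono>
#include <cstddef>
#include <vector>

struct Agent {
    void execute() { /* run perception, behaviour and action atoms */ }
};

void runAgents(std::vector<Agent>& agents, double aiTimeMs, std::size_t& resumeIndex) {
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    for (std::size_t n = 0; n < agents.size(); ++n) {
        std::size_t i = (resumeIndex + n) % agents.size();  // resume after last executed
        agents[i].execute();  // may overrun: we only regain control on return
        std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
        if (elapsed.count() >= aiTimeMs) {
            resumeIndex = (i + 1) % agents.size();          // pick up here next frame
            return;
        }
    }
}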

We decided to avoid the complication of pre-emptive scheduling (interruption of AI processes by the PM at any time). This does not mean that pre-emptive scheduling is not worth considering. However, on machines that make extensive use of instruction and data caches, such as the PlayStation 2, pre-emptive scheduling can reduce run-time efficiency.

Implementation

We developed a test-bed consisting of a population of agents inhabiting a 2-dimensional toroidal environment. Each agent moves within the environment according to a speed and heading, which it is able to modify subject to certain constraints on maximum acceleration/deceleration and turn rate. Each agent has two goals: to follow a target agent, and to avoid collisions with other agents.

Three versions of the testbed were implemented to demonstrate the benefits and costs of using the PM to execute AI code. The different versions are listed in Table 3.

Table 3. Testbed versions

  1. Non-scheduled naïve
  2. Non-scheduled
  3. Scheduled by process manager

Figure 2 presents performance results for the different testbed versions. Maximum performance is the number of agents at which "performance degradation" begins. For the non-scheduled testbed versions, "performance degradation" means dropping below a frame rate of 60 frames per second. For the scheduled versions, it means failing to completely execute all agents and their behaviours when scheduled to do so. "DAD" means "dynamic activation delay". Two scheduled versions of the testbed were evaluated, one with no dynamic activation delay, the other with a maximum dynamic activation delay of 2. The results show that while the performance hit from using the (non-optimised) PM is around 20%, much of this can be reclaimed through dynamic activation delay.

wright_02.gif

Figure 2. Testbed performance results

More importantly, the scheduled version can give the appearance of handling more than 67 agents by suspending the execution of some AI behaviours. The scheduled DAD version can process 169 agents before frame drop. In this situation, some of the agent behaviours are executing less frequently. By changing the priorities of agents at run-time and re-ordering the AgentList, game programmers can control which agents will lose AI resources first. A fuzzy upper limit and graceful degradation of AI performance replaces a sharp cut-off point for loss of frame rate. This is very useful for games with a variable and unpredictable number of agents that consume a variable and unpredictable amount of CPU time. This means that the need for manual optimisation of AI code is reduced. If your game allows "believable" rather than "accurate" AI processing then the limiting factor becomes drawing time rather than CPU time. You just might be able to satisfy that design brief after all.

Careful readers will note that the use of the process manager will require programmers to specify extra parameters for their code, such as whether code fragments may be interruptible, how much share of CPU time should be allocated to code fragments, and so forth. This can represent an increase in AI code development time. However, the important point is that time spent specifying process manager parameters will save time in the area of AI optimisation: due to the fuzzy upper time limit on AI processing per frame the point when hand optimisation of AI code is required is postponed. We anticipate that use of the process manager will save time overall, as hand optimisation of all types of code, not just AI, is a time consuming process.

AI Hardware Acceleration?

Universal methods are a prerequisite for hardware acceleration. Computational projective geometry is the universal basis of graphics programming, but there seems, at first glance, to be no such common basis for AI. Defining a general-purpose execution "pipeline" for AI processing is a step in the right direction. We think that the Process Manager describes such a pipeline: AI code can be run upon it, and there are benefits from doing so.

As stated, the process manager makes no assumptions about the precise nature of the AI code. The process manager is like an operating system in the sense that it can run multiple threads performing different kinds of computational work, and imposes no restrictions on the type of code that may be run. However, there is an issue of code granularity. Some AI mechanisms are more amenable to time-slicing than others. For example, we are working on a rule-based, declarative language to sit on top of the process manager. Our reasoning is that rule-based languages are much easier to interrupt, and exhibit finer execution granularity, than their procedural counterparts. The finer the code granularity, the better the process manager can allocate time. This is an issue of performance rather than competence. The process manager really can run any kind of code, whether AI or not, and whether the code content is evolutionary computation, finite state machines, simple flocking algorithms or any other mechanism. However, the granularity of that code can affect process manager performance.

But this is just a start. Chips that truly accelerate AI processing will need to deal with two main themes.

First, game AI code generally exhibits a coarse-grained parallelism at the agent level. Characters in a game "sense" the current game state, perform some "internal" processing and then "act" upon the game state to produce a new game state. The internal processing of each agent could occur in parallel without requiring significant changes to AI logic. Future AI accelerators could impart huge processing speed-ups with this kind of parallelisation. Note that this isn't fine-grained parallelism at the instruction level, but a coarse-grained parallelism at the agent level.
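On today's hardware this kind of parallelism can at least be sketched with threads. The illustration below (our own sketch, not from any accelerator) runs each agent's internal processing concurrently between a serial sense phase and a serial act phase.

// Coarse-grained agent parallelism: sense and act serially against shared
// game state; run the independent "thinking" of each agent in parallel.
#include <thread>
#include <vector>

struct Agent {
    void sense() { /* read game state into internal representation */ }
    void think() { /* internal processing, independent of other agents */ }
    void act()   { /* write internal results back to game state */ }
};

void updateAgents(std::vector<Agent>& agents) {
    for (Agent& a : agents) a.sense();
    std::vector<std::thread> workers;
    for (Agent& a : agents)
        workers.emplace_back([&a] { a.think(); });  // one thread per agent
    for (std::thread& w : workers) w.join();
    for (Agent& a : agents) a.act();
}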

Second, game AI code is essentially rule-based. In the abstract, AI code maps conditions on game state to changes to game state. In procedural languages, such as C and C++, the mapping is implemented as "if then" rules acting on data structures. Despite claims to the contrary, AI code is very ordinary in this sense: it is like other code, except it usually exhibits a higher density of condition tests that tend to predicate on a highly dynamic game state. AI hardware accelerators need to directly address the rule-based nature of AI code. Progress in this area is more difficult, as it requires developing and using new real-time, rule-based AI languages while simultaneously designing hardware that speeds the execution of such languages.

Ian Wright began creating games in the early 80's at the age of 14, developing two hit games for the ZX Spectrum home computer. He received a PhD in Artificial Intelligence from the University of Birmingham in 1997. He now develops AI technologies for PS2 at Sony Computer Entertainment Europe's "Team Soho" development studio. Ian can be reached at [email protected].

James Marshall came from a background in computer science, via artificial life research and part-time commercial game development on the Amiga, to his present position at Sony Computer Entertainment Europe's "Team Soho" development studio. He currently works on core AI technologies for PlayStation 2 games. James can be reached at [email protected], unless he's climbing mountains.

 
