informa
/
Features

Mathematics In Videogames

There are many mathematic concepts that underlie game programming. In this article, Muller covers the basics for people just starting out in game programming.

Writing computer and video games is an exciting job. Apart from the parties, flexible working conditions, freebie trips to Japan and the money, the work itself can be very satisfying. At its best it involves solving problems in groundbreaking new ways, more efficient in terms of time or space than was previously thought possible. But a good grounding in the fields of mathematics and science relevant to gaming is useful if you want to build on the knowledge that has been built up over the years. There are many techniques picked up by game coders over the years that have proved useful. If you look into them more deeply, you may even find new ways to apply them!

Many books have been written about most areas of math, science and computer science, but all too many of the advanced books have this in common: they were written by academics to get tenure. As such, the authors seemingly do not want normal people to read their work! Many of the introductory books are well written, but do not focus on using their principles in real life programs. Fortunately there are exceptions to this rule and a few books I found interesting are listed at the end of this article.

The growing power of computers has meant that mathematics and computer science are living subjects, with new ways of looking at problems being discovered each year. But still the ideas you use most often in games (which I am covering here) were written down decades (if not centuries ago).

What turns game coders off about mathematics (apart from a difficult writing style) is the apparent lack of relevance to games. This doesn't have to be the case! In this article I will try to show how some simple mathematical ideas are applied to game coding.

When I started writing computer games on the school computer in Fiji back in 1982 the games available commercially were very rudimentary, and didn't use most of these simple ideas. As a result, they could look jerky and simplistic.

In 1989 when I started writing games commercially most of these methods were being used by good programmers at the time. In fact that is how I learned some of them - from other coders rather than out of a textbook.

But now everything is in 3D. People are writing mainly in C instead of assembly language and the production values of games are a lot higher. A foundation in maths is almost taken for granted in new games programmers and yet some of the basic tricks we used to play around with in the 1980s seem to have been forgotten in the courses these bright young sparks have taken.

Enough history already! Let's get on to explaining how that stuff you fell asleep in class over is relevant to helping you write classy efficient game code. The material in one topic is often related to another topic, so you may find it useful to read through the article as a whole. Some topics are so huge they have had whole books written about them, so I have had to give them only a brief discussion.

Numbers

Numbers are the foundation of mathematics - you can't do much of any practical use without them. And everyone knows computers are great with numbers. So why are there so many different types of them and what are they useful for?

Integers: Whole numbers are easy

The numbers which range in steps of one unit both up and down from zero (·,-5,-4,-3,-2,-1,0,1,2,3,4,5,·) are called integers. Computers love integers. They can be stored exactly in binary and have efficient methods for addition and subtraction, as well as slightly less efficient methods of multiplication and division.

When used in computers, integers have upper and lower limits. For 16 bit signed integers stored in twos complement format (the format used for nearly all signed integers these days), the limit is -32768 to 32767. This limit is important when you are performing certain operations as overflow can occur (see the section on Pitfalls for details).

For many years integers have been favoured for storing variables used in games because operations on them (such as addition) were much faster than on other types of numbers. However these days the speed advantage of integers over floating point numbers (described later) has been somewhat diminished by fast co-processors on modern chips such as the Intel Pentium.

Scaling: Making the most of integers

If you store object co-ordinates or other values as integers, you may find that they lack precision. You may want to animate a sprite at less than one per frame, or move it by less than one pixel per frame. Do not do this by putting pauses in your code! You can allow much more flexibility by scaling the system of numbers you are using. For instance, you could divide the horizontal position of your main game sprite by 100 before plotting it on the screen. This allows you to give a horizontal velocity of 50 to the sprite to get it to move one pixel every second frame. It also allows you to give a wide range of velocities to it that cannot be emulated simply by pausing the movement for a number of frames.

Fixed point numbers: Efficient scaling

Dividing an integer by a scale factor is an expensive operation! So a more efficient way of doing this should be used. When the scaling factor is a power of two the division can be easily performed by using binary shifts (the >> operation in C). When a number is stored in this manner it is called a fixed point number, as the integer represents a number with both integer and fractional parts, separated by a binary point (like a decimal point but in binary) at a fixed position.

When games programmers say they use fast integer maths throughout their graphics routines, they often mean they are using fixed point numbers.

The most commonly used powers of two are 256 and 65536, but other ones are in use in many games (and in the operating systems of some next generation consoles). They can prove to be particularly efficient running on some machines.

Naming conventions for fixed point numbers specify the number of bits on both sides of the binary point. For example a 12.4 fixed point number has 12 bits of integral value and 4 bits of fractional value. If it is unsigned it can range from 0 to 4095.9375 in steps of 0.0625 (1/16 or 2-4).

Addition and subtraction are simple on fixed point numbers: just perform the operation on them as if they were integers (of course, they must be compatible by having the same scale factor). Multiplication on the other hand requires you to shift the result to the right before using it.

Floating point numbers: Flexible ranges

The main disadvantage of fixed point numbers is that they lack precision over a large range: they cannot hold very large or very small numbers, and when they do hold small numbers these are not very accurate. The way to get around this is to use floating point numbers. These are formed of two parts: a signed exponent (the range shift) and signed mantissa (the raw value). You can turn a floating point number into a fixed point number by shifting the mantissa by the number of bits specified in the exponent: left for positive exponents and right for negative exponents.

Floating point numbers are supported in hardware on some new processor chips like the Intel Pentium. The C and C++ languages support them as primitive types: float (32 bit), double (64 bit) and long double (80 bit). In machines which do not support them in hardware their operations have to be emulated in software, which is much slower.

Multiplication on floating point numbers is easier than addition! This is the opposite to the situation with integer and fixed point numbers. So when you are using floating point maths you may find algorithms different from the ones you use with integers appropriate.

Discreteness

Discreteness is a way of saying that values in computer games often have integer values. Time is not continuous - it is split up into frames of 1/60th of a second (for NTSC video consoles). Sprite positions are not continuous - they are displayed on pixel boundaries. This can be very useful as an approximation to a smooth universe.

Spatial discreteness: Aligning objects to grid boundaries

On nearly all video games consoles sprites must be placed on integral pixel boundaries. The exception is the Nintendo 64, which has hardware anti-aliasing built-in. This should allow it to display sprites on half-pixel or quarter-pixel boundaries. We will need to wait and see if 2D sprite based games for that machine are able to take advantage of this.

The use of fixed point (or floating point) numbers for game objects means that the objects are held with greater precision than is shown on the screen. When you are smooth scrolling a screen you need to stop its motion if it gets too small (e.g. less than 1/8th pixel per frame) to avoid jitter. In this case, to take account of spatial discreteness you should halt the scroll at the half-way point of a pixel rather than the edge. When the scroll restarts it will then have a fair chance of moving in either direction.

Temporal discreteness: Game frames

The concept of game frames is fundamental to traditional games programming. While it is possible to have an asynchronous event driven game, most games are based on the ticking of a clock which happens a fixed number of times per second.

Every time this clock signal is processed the game executes one cycle of thought. Each game object may move by a slight amount, some decisions may be made, and the display may be updated. Operations which you may have been taught to calculate all at once (such as the trajectory of a rocket) have to be done one step at a time.

What happens when you "miss a frame" and the game takes longer than expected to execute a game frame cycle (this is usually the fault of the graphics routines being overloaded by the current view being too complex)? How do you catch up and avoid the game slowing down? There are two main methods: running the game engine multiple times and scaling velocities.

The "multiple times" method is simple: you time how many frames you missed and then run the game code again that many times to catch up. This can work only because the game code is usually much faster than the display code - the display code does not need to be run multiple times. If the game code is very complex or the game is running on a slow machine, this method does not work. This method is often used on console games.

The other method (which works on slow machines) is to time how long each game frame is taking to process and then multiply each game speed by the appropriate number. This is rather harder for accelerations than velocities. This is the method most often used on PC games, because the machines have such a wide range of speeds.

In modern games the game frame is not the same as the display frame. The game is run in a separate "thread" from the display code, and so should be immune to the display code being bogged down by too many objects or a scene being too complex. This is a variation of the "run the game multiple times" solution, but requires some more thought to co-ordinate the game and the display thread. The shared variables that are written by the game code and read by the display code (such as sprite position and animation requests) should be double-buffered to prevent synchronisation problems.

Geometry

Classic or Euclidean geometry (named after the Greek mathematician, Euclid), is based on right angled triangles. It is customary to use Greek letters for angles, and in the following example I use q (theta) for the main angle and f (phi) for the other angle.

There are two points named, A and B. There are three sides, the adjacent, the opposite and the hypotenuse. The hypotenuse is the longest side in the triangle. The opposite is the side which is not near the angle you are interested in. The adjacent is the other side. There are two angles named in the triangle, f and q , as well as the 90¡ right angle. The sides are labelled with q being the angle we are looking at. In any triangle, the sums of the interior angles adds up to 180¡ . Therefore, q + f + 90¡ = 180¡ , so f = 90¡ - q . This means that one angle can be calculated from the other. The triangle can be defined by an angle and the length of one of the sides, or by the length of two of the sides. This leads to polar and Cartesian co-ordinates, but we will discuss those later.

Trigonometry: Those sine and cosine functions

What do the trigonometric functions mean? Well, sin q is the length of the opposite side divided by the length of the hypotenuse. Cos q is the length of the adjacent side divided by the hypotenuse. Tan q is the length of the opposite side divided by the length of the adjacent side. Since these functions take a lot of time to calculate, in a computer game you would generally store them in a look-up table. They are useful in converting between polar and Cartesian co-ordinate systems (this will be covered later in the article).

Pythagoras: Calculating distances

The Pythagorean theorem states that the square of the hypotenuse of a right angled triangle is equal to the sums of the square of the other two sides. This is useful in computer games because it lets you calculate the distance between two points accurately.

Now back to games. Imagine that there is an object at point A, co-ordinates (x1, y1). There is also an object at point B, co-ordinates (x2, y2). By the Pythagorean theorem, the distance between A and B is Ö ( (x2-x1)2 + (y2-y1)2 ). If you know how to calculate square roots, you can use this method to accurately find the distance between two points.

It is messy to calculate the angle between two points because there are four cases to consider, depending on which quadrant the right angle is in. This is discussed later.

If you know the angle q (you may have calculated it for use by your intelligence routines), you can calculate the distance between the two points by dividing the length of the adjacent side by cos q (if q < 45¡ ) or dividing the length of the opposite side by sin q (if q > 45¡ ).

Co-ordinate Systems

The concept of space in computer games requires numerical measures with which to set positions and orientations. There are many ways to specify co-ordinates, but only a few of them are in common use.

Cartesian co-ordinates: Grid lines across the world

This is the most often used system. Cartesian co-ordinates consist of a horizontal value (called x) a vertical value (called y) and for 3D games a depth value (called z). These three axes are orthogonal, which means they are at right angles to each other. They can be represented as a vector [x, y, z], which is displayed vertically when it is used in matrix multiplication (covered later).

Polar co-ordinates: Angles and distance

Polar co-ordinates consist of an angle and a scalar. They can be very useful for storing velocities of objects in two dimensional games (such as sports games). A person can be running in a particular direction and then change angle by a certain amount each game frame, while maintaining the same speed.

You can convert them into Cartesian co-ordinates by multiplying the scalar by the sine and negative cosine of the angle to get the horizontal and vertical co-ordinates respectively (this treats angle zero as being North and angle ¹ circle as being East. You can use other conventions in your code. The mathematical convention is for angle zero to be East and angle ¹ circle to be North).

Angles can be stored in several ways. Mathematicians say a circle is made up by 2p radians. Sailors use 360 degrees. French dictators rule that it is 400 gradients. But game coders generally use 256 units, as this is stored nicely in one byte.

Quaternions: 3d rotations that work

This is a great big topic of its own, so I must be brief. Quaternions are four dimensional numbers (in the same way that the complex numbers you learned about at school are two dimensional). They have a set of mathematical operations defined on them, such as multiplication and addition.

They are very useful for processing rotations in three dimensions. If you are controlling a spaceship like the one in Descent, you want a roll to be a roll whichever way you are facing. Quaternions allow you to do this. They are also useful for interpolating motion capture animations.

Frames of Reference: Screen (local) and World (global) co-ordinates

This is a fundamental idea that many programmers in the early 1980s had not heard of. Fortunately these days it is taught quite extensively because of its use in 3D graphics.

The basic idea is that the objects in the game have an existence in a game world which exists even when you can't see it. Their co-ordinates are all given relative to the game world, not the screen. In the old days some people would store all enemy creatures as screen offsets and change them directly when scrolling the screen. This was not a good way to code!

You should treat the view of the screen as a window into the game world. In 2D games this is a rectangle which covers part of the world. When a view is set to a particular area, those objects which lie within that region of the world will be displayed on screen. Those outside the area are invisible, as they are off-screen.


You can translate positions between world and screen co-ordinates by applying a transformation matrix to them. In the case of simple platform games with no scaling or rotation, this simplifies to adding a scroll offset to both co-ordinates.

Note that the "panel" objects can be treated differently, and referred to in screen co-ordinates, as they relate to your view rather than the game world itself.

Orthogonal systems: Treating co-ordinates independently

Cartesian co-ordinates are "orthogonal". The beauty of this is that they can be treated completely independently of each other. This means that often something we do (such as apply friction) in one dimension (such as x) we can do the same way to the others (y and z) without worrying how they interact.

Transformation matrices: Converting between frames of reference

In two dimensions, you can translate between frames of references by multiplying by a 2 by 2 matrix and adding a 2 element vector. This is not so hard as it looks, as this is just a mathematical way of showing two simple formulae.

screenx = a*x+b*y+e

screeny = c*x+d*y+f


What is interesting is the value of the constants a, b, c, d, e and f. For translation (just scrolling) a and c are one and b and d are zero (this is called an identity matrix) and e and f are the scroll offsets. For scaling, b and c are zero and a and d are the scale factors. For rotation the following formulae can be used, where angle is the angle of rotation of the screen relative to the world and scale is the relative scale of screen co-ordinates to world co-ordinates. E and f still represent scroll offsets, but you have to adjust them to take account of the rotation if you do not Îpre-adjust' the x and y co-ordinates before plugging them into the equation.

a=scale*sin(angle)

b=scale*cos(angle)

c=scale*cos(angle)

d=scale*-sin(angle)


The nice thing about using matrices to transform co-ordinates is that various transformations can be concatenated (made to occur one after the other) by multiplying the two matrices. With three dimensional graphics, a four by four matrix can hold everything you want to know about the transformation, including the angle of the viewing frustum (i.e. whether or not the camera is fish-eyed) and the way the perspective is calculated, as well as the translation, scaling and rotation required to render the scene.


Iteration

What computers do best is repetitive instructions. This has led to such useful instructions as for and while loops in C. In games, the temporal discreteness I mentioned earlier forces us to write games in a different way from a command-line utility. The interactivity of a game requires there to be no long pauses in gameplay while the computer figures out what to do. The most common way to ensure this is to split up calculations into bite sized pieces, and execute them one at a time. When we do this, we must figure out how to make sure these pieces are all executed in the correct order, since this can wreak havoc with traditional control structures such as for loops. The main ways of managing this are with states or threads.

States: Controlling the motion

The concept of a finite state machine is fundamental to the theory of computability, developed significantly by Alan Turing, the gay mathematician who also helped crack the Nazi's enigma code during the Second World War. (See http://www.wadham.ox.ac.uk/~ahodges/Turing.html or the book Alan Turing: The Enigma.)

The idea is that a computer consists of a state, a program and access to a certain number of variables. In the case of a Turing machine this is serial access to one read/write location on a large tape, but we will use random access to a fixed number of variables for speed. A program is simply a list of rules which say which state to go to if you are in a certain state and certain input variables are found. To increase speed and reduce program storage requirements from those of a Turing machine, we can use expressions to work out which state we want to go to. The state is the output of the routine.

Unstable or "virtual" states can be useful in some circumstances, but they require great care to avoid infinite loops. In this system, when a state change is made the program immediately jumps to the start-up code for this state. This code may decide that it needs to change state again, before the routine exits. So it is important to make sure that it is impossible for the processor to be caught up in a loop, constantly cycling between states without returning control. With care though, this can reduce the amount of code required to implement intelligent behaviour quite significantly.

Figure three has a list of states a simplistic enemy might use.

 


Figure three

Marching

Init

Start moving in a random direction.
 
Control

If player is close then aim, else remain marching.
 
Action

If we hit an obstacle, reverse direction.

Aim

Init

Face the player
 
Control

If player is above me, start jumping, else charge
 
Action

None

Jumping

Init

Set an upwards initial velocity
 
Control

If landed, charge, else remain jumping
 
Action

Fall under gravity.

Charge

Init

Start moving fast towards the player.
 
Control

If player is far then start marching else remain charging.
 
Action

If we hit an obstacle, reverse direction.


In this case, Aim is an unstable state, which leads directly to another state. It may be the target of several different state changes.

The important thing to keep track of in a state driven system is which variables are preserved across state routine calls. If the AI routine simply returns when the state processing is finished, any variables on the stack will be lost, so a pointer to object specific variables is essential.

Note that using pointers to functions is much faster than parsing an enumerated state through a massive switch statement! It also makes the code more readable.

States can be implemented in code or as tokens. In a token driven game, each object's AI is controlled by a list of tokens which represent commands. This is similar to the way byte codes work in Java, but the tokens are at a higher level and much more game specific.

Threads: Letting the operating system do the hard work

Keeping track of object states doesn't always have to be a pain. There is a way of generating "anonymous states" which puts the onus of the housekeeping work on the game's operating system (usually low level scheduling routines rather than the actual host operating system) instead of in the code for each object state.

This method is the sleep command - a form of non-preemptive multitasking. You just write a function called sleep which takes as a parameter the number of game frames you want the process to sleep for, or alternatively a handle to some condition which will wake up the routine, such as the player getting within a certain distance of the object.

When the sleep command is called, control passes out of the object's AI routine back to the game's scheduler. This means that in effect each object runs in its own thread. A round robin scheduling method can be used to give processor time to the AI routines of all objects, or threads can be placed in multiple linked lists according to the frame number in which they will awaken.

You must decide which variables are to be preserved by the sleep command. You may prefer not to preserve registers for example, but often need to preserve the stack (and hence all local variables as well as the state of any calling routine). To do this you can assign a local stack to each object, and then switch the stack pointer (this may require assembly language J ).

This technique can require quite a bit more memory than the state method since it is hard to predict how much stack space will be required by any routine. The programmer must also be careful to switch to a system stack if any hardware interrupts occur, to avoid doing nasty things to the object's local stack.

Other techniques can be used to make more efficient use of memory - some processors can automatically detect when a stack overflow occurs, and you may wish to use this to allocate variable sized local stacks to your objects. I consider this method too risky in terms of guaranteed response time to be used in a game.

Background Routines: The revenge of complex calculations

The use of threading can also allow us to use time-consuming calculations as well - if the threads are pre-emptive. This is usually implemented by using hardware interrupts (Java interpreters can emulate this in software).

You can split the game into three parts for example - the display thread, the main AI thread, and the strategy thread. The main AI thread is called at fixed intervals to ensure the game runs at a constant speed. The display thread is called less often (unless you have a very nice machine!) to keep the screen up to date with what is happening in the game. The strategy routines can run at a low priority in the background, and occasionally talk to the main AI thread. This allows them to perform complex calculations without worrying about slowing the game down.

The important thing with pre-emptive multi-threaded games is to handle synchronisation issues well. This is a topic that is well covered in computer science courses. To summarise: double buffer all systems of shared variables until a "transaction" (i.e. a game frame) is made. The transaction must be "atomic" (i.e. not interruptible). This can be done using either hardware methods (disabling interrupts for a short period of time) or software methods (the bankers algorithm).

Synchronisation bugs can be hard to remove, so build multi-threaded systems carefully from the start.

Feedback: Giant oaks from acorns grow

One of the greatest ways to make behaviour unpredictable (and therefore interesting) is by the use of positive feedback to accentuate small differences in a chaotic way. This can produce results as interesting as a Mandelbrot set (generated from the function A' = A2+C where C is a constant complex number and A is a complex variable initially set to C. The number of iterations it takes the magnitude of A to exceed 2.0 forms the Mandelbrot set). This positive feedback produces very interesting patt

Latest Jobs

Sucker Punch Productions

Bellevue, Washington
08.27.21
Combat Designer

Xbox Graphics

Redmond, Washington
08.27.21
Senior Software Engineer: GPU Compilers

Insomniac Games

Burbank, California
08.27.21
Systems Designer

Deep Silver Volition

Champaign, Illinois
08.27.21
Senior Environment Artist
More Jobs   

CONNECT WITH US

Register for a
Subscribe to
Follow us

Game Developer Account

Game Developer Newsletter

@gamedevdotcom

Register for a

Game Developer Account

Gain full access to resources (events, white paper, webinars, reports, etc)
Single sign-on to all Informa products

Register
Subscribe to

Game Developer Newsletter

Get daily Game Developer top stories every morning straight into your inbox

Subscribe
Follow us

@gamedevdotcom

Follow us @gamedevdotcom to stay up-to-date with the latest news & insider information about events & more