Back before Al Gore invented the Internet, back when 64KB was more memory than any computer would ever need, there was a time when memory managers didn’t exist. But gradually, new computer systems came out with larger amounts of memory (Figure 1). Game designers discovered ways to eat up RAM faster than any system could dish it out. It became obvious that managing all this data with a simple static memory map was too much work. Programmers needed a way to create memory maps that could change dynamically at run time depending on the state of the program. Thus, the memory manager was born.
Figure 1: Main memory growth for console game machines.
For an excellent introduction to how a basic memory manager works, check out www.memorymanagement.org/articles/begin.html. This article will assume this basic knowledge and focus instead on things to think about when writing your own memory manager and point out some of the problems you may encounter in using a memory manager. This discussion is based on our experiences in writing and rewriting the memory manager for Madden NFL 97 to Madden NFL 2002. This article is also slanted toward the C language as written for console game machines. The same principles would apply to C++ and Windows programming, but the details are likely different.
Growing Your Own
So you know that memory managers are great, but what’s wrong with just
using malloc and free?
They are part of the standard C library, so they should be the best, right?
One major problem is that you don’t really know what they are doing. If
you write your own memory manager, you can generate statistics, add debug
code, and add advanced features that might not be found in the default
memory manager — features like handles, heap compaction, and virtual memory.
The ultimate weapon of game programmers is context. You know the needs
of your game and can tailor the memory manager accordingly. Developing
a successful memory manager requires addressing five different issues:
ease of use, performance, memory overhead, fragmentation control, and
debugging support. Unfortunately, these attributes are all interconnected,
and optimizing one can result in substandard results for another. The
memory manager designer must govern a delicate balancing act.
At a lower level, memory manager design also requires paying attention to platform-specific requirements. In addition, it may be possible to utilize hardware support to assist the memory manager.
Ease of Use
While supporting both handles and pointers is possible, the resulting
memory manager is more complicated than one that supports a single method.
Madden used to support both handles and pointers until an analysis
showed that pointers were being used 99 percent of the time. Not surprisingly,
when given a choice, programmers used pointers, since they were the easiest
solution. Therefore, we simplified the latest Madden memory manager
by removing handle support and optimizing the pointer support.
Performance
Performance must be addressed both in terms of speed and consistency.
A memory manager that is fast most of the time but slows down dramatically
at unpredictable times is unacceptable in a gaming environment. Therefore
it is important to understand the issues that contribute to performance.
From the memory manager user’s point of view, two operations will impact
the game: allocations and recycling.
Allocation performance is determined by allocation policy, free list
management, and the use of garbage collection. The most popular allocation
policies are best fit, first fit, and next fit. By organizing the free
blocks as a linked list, best fit has consistent O(n) performance,
while first fit and next fit have worst-case O(n) and on average
O(n/2). By organizing the free blocks as a size-sorted binary tree,
best fit has worst-case O(n) and on average O(log n).
By organizing the free blocks as a balanced binary tree, best fit has
consistent O(log n). Garbage collection applies only to
memory managers with handle support, and generally involves a fairly significant
performance penalty when it occurs, as it essentially makes a copy of
the entire memory heap during compaction.
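The linked-list variants of these policies can be sketched in a few lines of C. The `FreeBlock` structure and function names here are illustrative only, not the actual Madden code; the point is the difference in scan behavior:

```c
#include <stddef.h>

/* Hypothetical free-list node; real field names are an assumption. */
typedef struct FreeBlock {
    size_t size;
    struct FreeBlock *next;
} FreeBlock;

/* First fit: return the first block large enough.
   Worst case O(n), but often stops early. */
FreeBlock *first_fit(FreeBlock *head, size_t size) {
    for (FreeBlock *b = head; b != NULL; b = b->next)
        if (b->size >= size)
            return b;
    return NULL;
}

/* Best fit: must scan the whole list for the smallest adequate block,
   so the cost is a consistent O(n) on a plain linked list. */
FreeBlock *best_fit(FreeBlock *head, size_t size) {
    FreeBlock *best = NULL;
    for (FreeBlock *b = head; b != NULL; b = b->next)
        if (b->size >= size && (best == NULL || b->size < best->size))
            best = b;
    return best;
}
```

Organizing the same blocks as a size-sorted tree, as described above, turns the best-fit scan into a tree search.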
Recycling
performance is based on free list management and the use of free block
coalescing. Free block coalescing requires the ability to locate the blocks
immediately preceding and following any arbitrary block. Some memory
structures, such as boundary tag (see Donald Knuth’s Fundamental Algorithms
under References for more information on the boundary tag), allow this
in O(1) time, while those that don’t require an O(n) scan of the
free block list. The current Madden memory manager uses boundary
tags to allow O(1) access to previous/subsequent blocks and organizes
the free list as an unbalanced binary tree. The result is allocation and
recycling performance both in the O(log n) to O(n)
range.
Memory Overhead
Memory overhead is the cost we have to pay to manage memory, since each
allocation will cost some memory. Memory overhead is an important consideration,
especially if the memory manager will be handling a large number of objects.
The first decision is whether the memory map state should be stored internally
or externally to the memory being managed. If it is stored internally,
then the maximum number of allocated blocks does not need to be known
in advance, but memory that is not directly CPU-addressable cannot be
managed. If it is stored externally, you must know the maximum allocated
and free blocks in advance and set aside memory for this state, but address
spaces that are not directly CPU-addressable can be managed.
Madden previously used external storage for the memory state, but
this required additional overhead because it was impossible to predict
accurately the maximum number of allocated and free blocks. We had to
include a “safety factor,” which turned directly into wasted memory. The
new memory manager uses internal storage as shown in Figure 2, thus avoiding
the entire issue. All allocations are 16-byte aligned and each block has
16-byte overhead. By limiting allocations to a minimum of 16 bytes, every
block is guaranteed to have this much storage available. Therefore, when
an allocated block is released, those 16 bytes can be used to organize
the block into the free list.
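A sketch of what such an internally stored block might look like. The actual Madden layout is not published, so every field name here is an assumption, chosen only to make the 16-byte arithmetic concrete:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte header stored in front of each allocation. */
typedef struct BlockHeader {
    uint32_t size;      /* payload size, a multiple of 16          */
    uint32_t prev_size; /* boundary tag: size of the previous block */
    uint16_t flags;     /* e.g. a "block is free" bit               */
    uint16_t group;     /* debug group label                        */
    uint32_t pad;       /* pad the header to exactly 16 bytes       */
} BlockHeader;

/* When a block is released, its first 16 payload bytes can hold the
   free-list links -- which is why a 16-byte minimum allocation size
   guarantees the space is always available. */
typedef struct FreeLinks {
    struct BlockHeader *left;
    struct BlockHeader *right;
} FreeLinks;

/* Round a request up to the 16-byte granularity. */
static size_t round_up16(size_t n) {
    return (n + 15u) & ~(size_t)15u;
}
```

The key property is that the header plus the minimum payload is always enough room to re-link the block into the free structure, so no external bookkeeping memory is ever needed.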
It is worthwhile to digress slightly and consider the management of non-CPU-addressable
memory. Because consoles are designed for low cost and high performance,
they sometimes incorporate special-purpose memory that is not directly
CPU-addressable. For example, the Gamecube has 16MB of auxiliary memory
(ARAM) that supports only DMA access. A memory manager that stores its
state internal to the memory it is managing cannot be used in such cases,
while a memory manager that stores its state externally can. While it
may seem appealing to use external state storage in order to support all
kinds of memory, our experience with Madden has shown this to be
a mistake. Memory that is not directly CPU-addressable is normally used
only for special types of objects, due to the complexity of using the
memory, and often contains only a small number of objects. Therefore,
Madden now uses a single, general-purpose internal storage memory
manager for all CPU-addressable memory and an additional, customized external
storage memory manager for any special-purpose case, such as the Gamecube’s
ARAM.
Fragmentation Control
Figure 2: Memory structure image.
Memory fragmentation
is a condition that occurs over time as memory is allocated and released
and isolated free blocks form. This is not usually an immediate problem,
but as more fragments form and they get smaller, the opportunity for an
allocation failure due to fragmentation increases. Eventually, if fragmentation
gets severe enough, an allocation may fail because no single free block
is large enough to accommodate the request even though the total amount
of free memory is sufficient (Figure 3). Therefore, a good memory manager needs to
either take steps to limit the amount of fragmentation that occurs or
be able to consolidate fragmentation periodically through garbage collection.
An allocation failure is often a lethal error for game code and must be
avoided at all costs. While it is generally fairly easy to determine the
maximum memory required by an application, fragmentation can make such
a calculation meaningless. Allocation policy, free block coalescing and
multiple heaps all play a part in minimizing fragmentation. Free memory
coalescing is a straightforward technique for limiting fragmentation
that attempts to merge a newly released memory block with its neighbors.
If the block immediately preceding or following the newly released block
is also free, merging the blocks together results in a single, larger
free block. This particular technique is almost mandatory, as without
it, fragmentation occurs very quickly. In addition, this limits the size
of the free list to the minimum number of blocks, which generally has
a positive performance impact on allocation. Since this technique has
no impact on how the memory manager is used externally, it is incorporated
into most memory manager designs.
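The merge itself is simple once each block can reach its physical neighbors. Here a doubly linked block list stands in for the boundary-tag navigation described earlier; the structure and names are illustrative, not the actual Madden code:

```c
#include <stddef.h>

/* Each block can reach its physical neighbors in O(1), as a boundary
   tag would allow. */
typedef struct Block {
    size_t size;
    int free;
    struct Block *prev; /* physically preceding block */
    struct Block *next; /* physically following block */
} Block;

/* Merge a newly released block with any free neighbors, leaving one
   larger free block and a shorter free list. */
void coalesce(Block *b) {
    b->free = 1;
    if (b->next && b->next->free) {      /* absorb the following block */
        b->size += b->next->size;
        b->next = b->next->next;
        if (b->next) b->next->prev = b;
    }
    if (b->prev && b->prev->free) {      /* fold into the preceding block */
        b->prev->size += b->size;
        b->prev->next = b->next;
        if (b->next) b->next->prev = b->prev;
    }
}
```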
The choice of allocation algorithm has a definite impact on fragmentation.
Unfortunately, because the size and timing of allocations and releases
vary for every application and are often different from run to run of
an application, it is impossible to say that a particular algorithm is
always better or worse than another. However, in general, it has been
found that “best fit” algorithms tend to exhibit the least fragmentation.
See Johnstone and Wilson’s “The Memory Fragmentation Problem: Solved?”
in References for more information on the impact of allocation techniques
on fragmentation.
Figure 3: Although 128 free bytes are available, the largest possible allocation is only 64 bytes due to fragmentation.
A technique
that requires more involvement from the memory manager user is to allocate
blocks with similar lifetimes close to each other. If the lifetime of
a memory block is known when it is being allocated, the allocation algorithm
can attempt to place it by other blocks with a similar lifetime. When
these blocks are released, they will be merged into a single large free
block. The problem is that this requires the caller to specify both the
lifetime of each block as well as the total size of the group of blocks
of similar lifetime. As a result, a simpler version of this technique
is usually implemented through the use of multiple heaps. By allocating
blocks with similar lifetimes within their own heap, a similar effect
is achieved, though there are generally practical limitations on the number
of heaps that can be effectively utilized.
Debugging Support
One of the main benefits of writing a memory manager is the ability to
add capabilities that are not provided in the supplied memory manager.
By adding some debugging features you can make sure that you are managing
the memory rather than stepping in the heap. To avoid stepping in the
heap, you’ll need to be able to see the heap. Adding information and tools
to help visualize memory usage can be a big help when managing memory.
This can be done in a number of ways.
The debugging
techniques used by the Madden memory manager include the ability
to audit memory usage, collect usage statistics, check for consistency,
and follow the memory map from a debugger. Auditing is really nothing
more than being able to mark a group of blocks when they are allocated
and later check to see if they were all released. Usage statistics, such
as maximum number of allocated and free blocks, as well as maximum allocated
memory, are valuable for evaluating the memory utilization of the application.
Consistency checking goes through the memory map and makes sure there
are no detectable errors in it. This means verifying block sizes, checking
boundary tags, ensuring matching block counts, and so on. By performing
consistency checking prior to every allocation and release (in a debug
build), certain forms of memory corruption can be detected quickly. Also,
the memory map contains ASCII debugging information that can be viewed
from within the debugger. Prefixing every allocated block with a text
string indicating the module and line number of the caller that allocated
the memory greatly assists when debugging after a crash. Naming the memory
blocks is a necessity. Imagine going to a store where every item was in
a plain box that only had its UPC code and box size printed on it.
Finding the item you wanted to buy would be a big challenge. Likewise,
if you have a list of memory blocks that only contains their address and
size, locating memory blocks is going to be difficult unless you give
those memory blocks names. Using the source file name and line number
where the allocation occurred would be a good start. In C, this can be
accomplished quite easily through the use of macros and the C preprocessor.
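One way to do it is to have the allocation macro paste `__FILE__` and `__LINE__` into a string and hand it to the allocator. The macro and function names here are invented for illustration, and the 64-byte debug prefix is an assumption standing in for whatever the block header actually stores:

```c
#include <stdlib.h>
#include <string.h>

#define MEM_STR2(x) #x
#define MEM_STR(x)  MEM_STR2(x)
/* Stamp every allocation with the caller's file and line. */
#define MEM_ALLOC(size) mem_alloc_named((size), __FILE__ ":" MEM_STR(__LINE__))

/* Toy back end: a real manager would put the name in the block header;
   here a 64-byte text prefix stands in for it. */
void *mem_alloc_named(size_t size, const char *name) {
    char *p = malloc(64 + size);
    if (!p) return NULL;
    strncpy(p, name, 63);
    p[63] = '\0';
    return p + 64;
}

/* Recover the name from a block for leak reports or crash dumps. */
const char *mem_block_name(void *block) {
    return (const char *)block - 64;
}

void mem_free(void *block) {
    free((char *)block - 64);
}
```

Because the name sits in memory as plain ASCII, it is exactly the kind of text you can page to in the debugger after a crash.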
Now you can recognize your blocks in a crowd, but it sure would be nice
to know who they hang out with. By adding a group label or flags, you
can group your allocations at the conceptual level rather than being limited
to grouping your blocks by source file. This way you can know that a memory
block that was allocated is really being used by your main character,
even though the actual memory allocation occurred in your texture library.
Figure 4 shows an example memory dump as seen by a (slightly drunk) debugger.
Including debugging information in textual form within the memory map
allows you to make sense of it from the debugger. For example, if you
had an invalid address access at 0x007cfea4, you could find that address
in the debugger and page around it to see that it occurred in the Joystick
Driver library and that the memory in question was allocated by pJoyDrvr.c
at line 225.
Figure 4: Memory blocks with debug information as viewed in the debugger.
With all this information available, you will need to find ways to view that information that can help you when managing your game. A linear list of memory blocks can be useful for spotting potential memory fragmentation, while a list of memory blocks sorted by name can be useful when you have memory leaks.
If you do
find a memory block that has leaked, you will know from its name exactly
where the allocation occurred. With group labels you can print out compact
memory maps that show how each conceptual module in your code is using
resources. By tracking this throughout the project, you can easily spot
modules that are using more or less memory than you had budgeted in your
original design. You can also create functions to check whether groups
of memory allocations have been freed. This can help prevent memory leaks
if you know that in certain situations some groups have no allocations.
Keeping track of worst-case situations is also important. Have the memory
manager save the lowest amount of free memory it has ever encountered
(update this every time an allocation occurs). If you are using some sort
of node system to keep track of your memory blocks, keep track of the
highest number of nodes that the memory manager has ever used so that
you know if you are close to running out of memory blocks. Sentinels added
to the beginning and end of memory allocations can help detect overflow
situations that would normally corrupt other data in memory. If the sentinels
don’t check correctly when the memory is freed, then some code has been
bad.
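A minimal sketch of such sentinels, assuming 4-byte guard words bracketing the payload (the guard value and layout are illustrative, not the actual Madden scheme):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SENTINEL 0xDEADBEEFu  /* any distinctive guard value works */

/* A payload of N bytes sits between two 4-byte guard words:
   [ guard | payload | guard ].  An overrun in either direction
   trashes a guard. */
void sentinel_arm(uint8_t *raw, size_t payload) {
    uint32_t s = SENTINEL;
    memcpy(raw, &s, 4);
    memcpy(raw + 4 + payload, &s, 4);
}

/* Called on free: a mismatch means some code wrote past its block. */
int sentinel_ok(const uint8_t *raw, size_t payload) {
    uint32_t head, tail, s = SENTINEL;
    memcpy(&head, raw, 4);
    memcpy(&tail, raw + 4 + payload, 4);
    return head == s && tail == s;
}
```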
Filling memory with recognizable patterns can be extremely useful. We
use a different pattern for unallocated, allocated, and freed memory.
This way, we can tell in the debugger at a glance what the situation of
a particular piece of memory is. When someone forgets to initialize memory
they allocated properly, the “allocated” pattern shows up.
You can also have functions that scan free/unallocated memory and make
sure that it all still matches the prespecified patterns. If it doesn’t
match, some code out there is incorrectly writing to memory locations
that it doesn’t own. Finally, make sure that you set up an extra heap
for debug memory and put all this extra debug information there. You want
your game memory to be as similar as possible between a debug and final
build.
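The fill-and-scan idea above is only a few lines of C. The particular pattern bytes here are arbitrary stand-ins; any distinctive values that are unlikely to be valid pointers or data will do:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative fill bytes for the three memory states. */
#define FILL_UNALLOCATED 0xAB
#define FILL_ALLOCATED   0xCD
#define FILL_FREED       0xEF

void mem_fill(void *p, size_t n, uint8_t pattern) {
    memset(p, pattern, n);
}

/* Scan a region that should still carry its fill pattern; a mismatch
   means some code wrote to memory it does not own. */
int mem_pattern_intact(const void *p, size_t n, uint8_t pattern) {
    const uint8_t *b = p;
    for (size_t i = 0; i < n; i++)
        if (b[i] != pattern) return 0;
    return 1;
}
```

Glancing at a block in the debugger and seeing a wall of 0xCD bytes instantly tells you it was allocated but never initialized.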
Platform Specifics
A memory manager presents a logical view of memory in the same way that
a file system provides a logical view of a disk drive. Most often, memory
managers are concerned with managing dynamic RAM of some sort. Some console
makers like to make things more interesting by providing a relatively
large main memory but also scattering other smaller chunks of RAM around
the system.
The memory
manager allows us to abstract away the physical details of the memory
subsystem and deal instead with a nice, logical model of memory. For example,
on the PS2 we don’t necessarily need to know that the first megabyte of
RAM is reserved for the operating system. It’s enough that the memory
manager knows. By abstracting away some of the details of the physical
memory system, our game can become more platform independent.
Most console hardware has alignment requirements that, not so surprisingly,
differ from platform to platform. Many platforms require anywhere from
4- to 64-byte alignment for buffers in graphics rendering or file IO.
Each type of hardware might need the memory manager to be tweaked to better
fit the needs and abilities of the platform. Often this information can
be passed to the memory manager using a heap setup structure.
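Such a setup structure might look like the following; the field names are invented for illustration, and the alignment helper assumes a power-of-two requirement, which matches every console platform we know of:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-heap configuration passed in at startup. */
typedef struct HeapSetup {
    void  *base;        /* start of the memory region to manage    */
    size_t size;        /* region size in bytes                    */
    size_t alignment;   /* platform requirement, e.g. 16 or 64     */
    int    cpu_visible; /* 0 for DMA-only memory such as ARAM      */
} HeapSetup;

/* Align an address up to the heap's requirement.
   alignment must be a power of two. */
static uintptr_t heap_align(uintptr_t addr, size_t alignment) {
    return (addr + alignment - 1) & ~((uintptr_t)alignment - 1);
}
```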
Finally, you should be wary of third-party libraries that may use malloc
and free, effectively bypassing your memory manager's interface. The
printf function in the PS2 standard C library uses malloc and free; our
solution was to write our own printf function.
Hardware Support
On most modern computers, the issue of fragmentation has been greatly
reduced by the use of a hardware-based paged memory management unit (PMMU).
Obviously, the fact that virtual memory provides an application with lots
of addressable memory means that even an inefficient memory manager can
be used. However, the more interesting point is that the PMMU without
any secondary storage can dramatically help with fragmentation. The PMMU
takes a very large address space (larger than the physical RAM it represents)
and maps it onto physical memory. Obviously, this mapping is not one-to-one,
but rather it maps a subset of the memory space onto a “working set” of
memory pages.
The key impact of using a PMMU in terms of fragmentation is that when
a memory block is released, any portion of that block completely spanning
one or more PMMU pages can be remapped by the PMMU. The result is actually
two forms of fragmentation: address-space fragmentation and memory fragmentation.
While this effect might seem to make a bad problem worse, it actually
simplifies things. Because the PMMU provides a large address space, the
address-space fragmentation can be largely ignored. Instead, the allocation
algorithm concentrates on minimizing memory fragmentation by first attempting
to utilize the unused portions of allocated memory pages (pages that contain
some allocation but are not completely full) before remapping unused memory
pages back into memory.
Managing Your Memory Manager
Memory managers are like most real-life managers. You have to keep them
under control, or else they tend to wander off without adequate information
and make strange decisions. Your memory manager sometimes needs your help
to stay on the right track too. Let’s look at some common problems that
can occur and how to work around them.
Fragged again. One of the major side effects of hand grenades and memory
managers is fragmentation. Previously, we discussed fragmentation in the
context of designing the memory manager to avoid or reduce the effects
of fragmentation. However, there are also some application-level techniques
that can reduce fragmentation.
The use of static allocation (memory defined within a module with storage
allocated within the application image) avoids the memory manager completely
and thus avoids fragmentation. Of course, this usage really only makes
sense when a single object will be represented and when the lifetime of
that object is approximately the duration of the entire application. In
such cases, static allocation can provide a benefit by limiting utilization
of the dynamic memory manager to those memory objects that are truly dynamic.
Another strategy that relies entirely on the memory manager user is to
perform submanagement of memory based on specific program knowledge. For
example, if a module knows that it needs to allocate X objects of size
Y, it may be far more efficient for the user to allocate a single block
of X * Y bytes and perform its own management within that larger block.
By reducing the number of blocks that the memory manager has to deal with,
the user has generally made the memory manager's job easier. There is, of course, a caveat. Depending on
the amount of fragmentation and the size of X * Y, it is possible that
the application could find itself in the situation where an allocation
of X * Y fails due to fragmentation, whereas X allocations of Y would
have succeeded. We also try to discourage this practice when possible,
as there is code-maintenance overhead.
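A fixed-size pool is the classic shape of this submanagement. In this sketch, malloc stands in for the game's real allocator, and all names are invented: one X * Y-byte block is carved into X objects threaded onto a private free list, so allocation and release are O(1) and cause no fragmentation inside the pool:

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct Pool {
    char  *storage;   /* the single X*Y-byte block */
    void  *free_head; /* private free list         */
    size_t obj_size;
} Pool;

int pool_init(Pool *p, size_t obj_size, size_t count) {
    if (obj_size < sizeof(void *)) obj_size = sizeof(void *);
    p->obj_size = obj_size;
    p->storage = malloc(obj_size * count);  /* one allocation for X objects */
    if (!p->storage) return 0;
    p->free_head = NULL;
    for (size_t i = 0; i < count; i++) {    /* thread objects onto free list */
        void *obj = p->storage + i * obj_size;
        *(void **)obj = p->free_head;
        p->free_head = obj;
    }
    return 1;
}

void *pool_alloc(Pool *p) {                 /* O(1), no fragmentation */
    void *obj = p->free_head;
    if (obj) p->free_head = *(void **)obj;
    return obj;
}

void pool_free(Pool *p, void *obj) {
    *(void **)obj = p->free_head;
    p->free_head = obj;
}
```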
One way to help avoid memory fragmentation is to always free memory in
the opposite order from which it was allocated. In fact, if you were always
able to allocate and release in such an order, you would never have memory
fragmentation. Realistically, it’s not possible to follow this practice
all the time, but doing it as much as possible will help. Memory fragmentation
is going to occur, and at some point you will probably run into a situation
where it causes a problem in your game. You might have fragmentation that
is occurring in one part of your game that is causing a memory allocation
to fail in a totally unrelated area. Fragmentation might even manifest
itself as a situation where your game will fail only when arriving at
part of your game through a specific path (that causes the fragmentation).
If you don’t have some advanced form of garbage collection, you are going
to have to use other, more crude methods to limit this problem. One possibility
is to change the code that is causing the fragmentation to use a different
allocator so that it doesn’t cause fragmentation. A common way to accomplish
this is to have an allocator that allocates from the other end of memory.
Depending on your game, fragmentation can become more problematic over
time (especially if your allocations don’t have close locality of lifespan).
You can use brute force to minimize these effects, such as shutting down
and restarting code modules between levels as a brute-force garbage collection
technique.
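The allocate-from-the-other-end idea reduces, at its simplest, to a two-cursor bump allocator: long-lived allocations grow from the bottom of the region and fragmentation-prone transient ones from the top, so churn at one end never pokes holes in the other. This is only a sketch with invented names, not a complete allocator (there is no per-block free here):

```c
#include <stddef.h>
#include <stdint.h>

typedef struct DualHeap {
    uint8_t *lo, *hi;   /* current cursors at each end of the region */
} DualHeap;

void *dual_alloc_low(DualHeap *h, size_t n) {
    if (h->lo + n > h->hi) return NULL;  /* ends collided: out of memory */
    void *p = h->lo;
    h->lo += n;
    return p;
}

void *dual_alloc_high(DualHeap *h, size_t n) {
    if (h->hi - n < h->lo) return NULL;
    h->hi -= n;
    return h->hi;
}
```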
Release builds. When running a release build, there isn't any debugging
information in the game, as it will consume extra memory. But you still
need a way to know where the game runs out of memory. For Madden,
we assume the game fits in memory. If the game does run out, it will crash and optionally
dump some memory info. With a special build, we display some partial information
about the memory manager and the game state so that we can determine if
it ran out of memory because of fragmentation or other reasons.
Getting on the Field
When talking about memory management, programmers often resort to words
more often associated with cow pastures than games. Terms like heaps,
stacks, and chunks are thrown around like flies buzzing around you-know-what.
To see how important a memory manager is to a game, you have to get past
the abstract poo. A good memory manager allows you to have more animations,
more characters, more textures, more sounds — in short, more of everything
that your game-playing customers love.
In this article we have described some of the issues that may come up
in writing and using your own memory manager. After years of writing and
rewriting memory managers for the Madden series, one piece of advice
that bears highlighting is simply to make sure you schedule adequate design
time on this very important piece of your system; you’ll be glad you did.
References
The Memory Management Reference Beginner's Guide: www.memorymanagement.org/articles/begin.html
Johnstone, Mark S., and Paul R. Wilson. "The Memory Fragmentation
Problem: Solved?" In Proceedings of the International Symposium on
Memory Management. ACM Press, 1998, pp. 26–36. www.cs.utexas.edu/users/wilson/papers/fragsolved.pdf
Knuth, Donald E. The Art of Computer Programming, vol. 1, Fundamental
Algorithms. Reading, Mass.: Addison-Wesley, 1973. pp. 435–444.