
In-depth: The value of Valgrind

In this reprinted <a href="http://altdevblogaday.com/">#altdevblogaday</a> in-depth piece, Electronic Arts software engineer Maximilian Burke explains the value of the Valgrind tool suite, and how it can be useful for tracking down memory-related bugs.

Game Developer, Staff

March 7, 2012


My team at work has spent quite a while porting our core technology stack to a variety of platforms. We currently support fifteen platforms, with many more coming soon. The biggest benefit to come of all of this, however, is an increase in the quality of our code.

In addition to dealing with the peculiarities of more than fifteen compilers, especially around how some of them handle aliasing, porting brings exposure to extra tools. Supporting Xbox 360 gives you access to Microsoft's static code analyzer, and Mac OS and Linux both provide access to clang's static analyzer and Valgrind.

Valgrind is a collection of tools based around a VM, not entirely unlike the JVM or the .NET CLR, but where the opcodes are the native instructions for your platform. The Valgrind runtime breaks the instruction stream down into its own SSA format, which its plugins can then operate on.

One plugin, memcheck, has been especially informative. It keeps track of which memory is initialized (both stack and heap) and which isn't, detects reads and writes beyond allocation boundaries, and checks that calls to certain standard library functions (like strcat, strncat, strcpy, strncpy, memcpy) are conformant with regard to overlapping memory regions and necessary destination sizes.

Why is this handy? Consider this code:

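// Reads the int that 'a' points to and adds three to it.
int AddThree(const int *a)
{
    return *a + 3;
}

int Foo()
{
    int a;               // never given a value
    return AddThree(&a); // AddThree reads the uninitialized 'a'
}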

Most compilers will run right by this code without seeing any problems. Down the road when the result of Foo() is used, Valgrind will point out that the memory was uninitialized.

==22064== Syscall param write(buf) points to uninitialised byte(s)
==22064==    at 0x2C11BA: write$NOCANCEL (in /usr/lib/system/libsystem_kernel.dylib)
==22064==    by 0x17B59D: __sflush (in /usr/lib/system/libsystem_c.dylib)
==22064==    by 0x1A6F6C: __sfvwrite (in /usr/lib/system/libsystem_c.dylib)
==22064==    by 0x175990: __vfprintf (in /usr/lib/system/libsystem_c.dylib)
==22064==    by 0x17118D: vfprintf_l (in /usr/lib/system/libsystem_c.dylib)
==22064==    by 0x17A2CF: printf (in /usr/lib/system/libsystem_c.dylib)
==22064==    by 0x100000E26: main (test.cpp:26)
==22064==  Uninitialised value was created by a stack allocation
==22064==    at 0x100000D70: Foo() (test.cpp:10)
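
The last two lines of the report, which name the stack allocation in Foo(), are the kind of origin information memcheck prints when run with its --track-origins=yes option. Once the culprit is known the fix is usually small. The following is a minimal sketch rather than the author's actual change: FooFixed is a hypothetical name, and zero is an assumed starting value chosen only for illustration.

```cpp
#include <cassert>

// Same helper as in the article, with a defensive null check added.
int AddThree(const int *a)
{
    assert(a != nullptr);
    return *a + 3;
}

// Hypothetical fixed variant of Foo(): 'a' gets a defined value before its
// address is handed to AddThree. The value 0 is an assumption.
int FooFixed()
{
    int a = 0;
    return AddThree(&a);
}
```

With the variable initialized, every byte that eventually reaches printf has a defined value, so memcheck has nothing to report for this path.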

Unfortunately, running under Valgrind is much more memory and CPU intensive. With memcheck your program will run 20-30x slower than normal, and other plugins are even more demanding. Memcheck also doesn't get the same information if you use replacement standard library functions, like the aforementioned strcpy, strncpy, strcat, strncat, and memcpy. And although Valgrind provides macros for telling the runtime about the state of memory blocks managed by your own custom allocators, they don't seem to work quite as well as the ones it provides for the system malloc/free functions.

In addition to memcheck, Valgrind comes with several other plugins. Cachegrind profiles how your program interacts with the processor caches and tracks branch (mis)prediction. Run on a naive matrix transposition function, for example, it reports stats at both the translation unit and function level. The 'I' stats are for the instruction cache and the 'D' stats for the data cache. The 'r' and 'w' suffixes are counts of reads and writes, and 'mr' and 'mw' are counts of read misses and write misses. The data cache stats are split between level 1 and last level, which could be L2 or L3 depending on your processor's architecture.

There is also callgrind, which is similar to cachegrind but generates call graphs as well; helgrind and DRD, which help detect errors in multithreaded code such as data races and incorrect use of threading primitives; and massif and DHAT, which profile your heap (and stack) usage. Finally, there is an experimental tool called SGCheck, which aims to detect global and stack array overruns.

Even though I've barely scratched the surface of what Valgrind is capable of, I've found it to be immensely useful when tracking down memory-related bugs. Because it's free and really easy to use (no special libraries required, just run 'valgrind foo'), it's easy to promote to others on your team, and also easy to hook into your automated tests if you have any. Besides, one can never have too many tools at their disposal!
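
For anyone who wants to try the Cachegrind workflow described earlier, a naive transpose might look something like the sketch below. This is not the code behind the author's numbers: the function name TransposeNaive, the row-major layout, and the use of std::vector are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive out-of-place transpose of a rows x cols matrix stored row-major.
// Reads walk 'src' sequentially, but consecutive writes to 'dst' land
// 'rows' elements apart, a strided pattern that tends to miss in the
// data cache once the matrix is large.
std::vector<float> TransposeNaive(const std::vector<float> &src,
                                  std::size_t rows, std::size_t cols)
{
    assert(src.size() == rows * cols);
    std::vector<float> dst(src.size());
    for (std::size_t r = 0; r < rows; ++r)
    {
        for (std::size_t c = 0; c < cols; ++c)
        {
            dst[c * rows + r] = src[r * cols + c];
        }
    }
    return dst;
}
```

Called from a small driver and built with debug info, a binary (here hypothetically named transpose) can be profiled with 'valgrind --tool=cachegrind ./transpose', and the cachegrind.out.<pid> file it writes can be summarized per function with cg_annotate.
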
[This piece was reprinted from #AltDevBlogADay, a shared blog initiative started by @mike_acton devoted to giving game developers of all disciplines a place to motivate each other to write regularly about their personal game development passions.]

