The idea behind this dependency-based strategy is quite simple. A dependency represents a link between a source asset and processed output file, indicating that the latter contains data that is provided (or affected by) the former. So, we say that the output file depends on the source asset, and that the asset is a prerequisite of the output file. This is a many-to-many relationship; one output file may depend on many source assets (consider, for example, a model file containing a mesh, textures, and animation data), and one source asset may generate many output files.
Dependencies are not limited to being simple links between pairs of files, either; if some files are built using intermediate files, or depend on other output files, then a dependency chain emerges, where each of the dependencies of a file may in turn have dependencies of their own. Figure 1 shows a simple dependency chain for a character model. If the entire asset pipeline was viewed, elements of this chain (for example, the run animation) might be used in other characters as well, and therefore have additional dependent resources.
Walking along the dependency chain for an output file, therefore, provides a list of all of the source (and intermediate) files that affect it, and hence may cause it to be rebuilt if they change. However, while this is a useful view conceptually, in practical terms it is usually more useful to look at dependency chains the other way around: for a given source asset, walking along the chain for its dependents will give a list of output files that must be updated if it is changed.
|Figure 1: A simple asset dependency chain.|
Dependency chains do not generally exist in isolation, either; chains frequently meet and overlap (for example, if one intermediate file or asset is used by many processes). This is actually another very useful property because in doing so, they provide all the information needed to minimize the amount of effort required to perform a single set of updates.
Consider the case where a number of source assets have all been changed. If each change is processed independently, and the individual dependents of the source asset updated, then some output and intermediate files may be updated several times. This is particularly problematic in the case where there are several "layers" of intermediate files depending on one another; in these cases, it is hard to remove the unnecessary updates because only the last update of any given asset is guaranteed to have a complete set of up-to-date intermediate files!
Figure 2 shows an example of this type of more complex dependency chain. The knight and paladin models share the same run animation, but have different base meshes. However, they both use the same texture page and therefore, the intermediate texture page file is shared between them.
|Figure 2: Dependencies on shared assets.|
The dependency chains contain the solution to this, as they store all of the necessary information about the relationships between the files to ensure that every file (both intermediate and output) is updated once only, but in the correct order to ensure that old data is never used. This is done by walking through all of the dependency chains simultaneously, and building a queue of the files that must be processed.
One very straightforward way to do this is to exploit the fact that the dependency chains themselves encode the order that operations must be performed in. To build the queue of operations is a simple iterative process, using a list of potentially modified files as the basis.
The first step of the procedure is to take every source file that has changed, and recursively walk down to all its dependents, adding each to the list (if it is not already present). After this step, the complete set of files that must be updated is stored on the list and the processing order can be determined.
This is done by repeatedly walking through the list and checking each file to see if it is ready to be processed. This is done by examining the files it immediately depends on; that is, those prerequisites that are directly linked to it. If any of those files is still on the list, then it cannot be processed yet, and is skipped. However, if none is present, then the file is moved to the end of the queue. This process is then repeated until there are no files left on the list. With this done, the queue contains an ordered list of the files for processing, so that every file is only updated once, and all of the prerequisite files are updated before each.
While in the majority of cases the files will be processed in a linear manner, and therefore this queue is all that is required for the operation to begin, it is also possible to produce output in a form suitable for processing many assets in parallel, for example, using a distributed network of machines, or a multi-CPU system. To do this, the same procedure is used, but with a marker added to the items on the list. When an item with no prerequisites is found, instead of being moved to the queue immediately it is marked and left in place. Then, when the end of the list is reached, all of the marked files are moved into the queue as a "batch." Each of these batches consist of files that are ready for processing but are also guaranteed to be independent of each other, so they can all be handled simultaneously if needed.
While in many cases analyzing the dependency information once and then processing the resulting queue of output files is enough; in cases where there are large numbers of changes being made to the source assets, it may be desirable to update the processing queue as changes are made. This can be done very simply by taking the current outstanding queue entries and adding the dependents of the newly-modified assets to them, creating a new list of files that need updating. Then, the dependency analysis procedure can be repeated using this input list to generate a new queue for continued processing, thereby ensuring that any changes caused by the new updates are correctly inserted into the processing order.
This technique can be very useful if the asset processing system allows multiple tasks to run concurrently, as it means that a single processing operation does not block the entire system until it completes, though unrelated operations may still be executed in parallel with it.
Determining Asset Dependencies
One of the key problems faced when implementing a system of this nature is how to actually construct the dependency information for the assets in the first place. The mechanisms for doing this will depend to a large degree on the processing tools and files being used, but there are some general areas that most techniques fall into:
Explicitly Stored Dependency Information
In some systems, such as the make tool which will be described in more detail later, the dependency information is stored as part of the script that describes all the desired processing operations. In general, this file is human generated, although dependencies can be specified for groups or types of files as well as individual assets, reducing the amount of maintenance required. This approach has the advantage that it is very easy to see and edit the dependency information, especially if it is necessary to add some special case entries for certain assets.
However, there are several fairly significant disadvantages of this system. Dependencies must be consistent across fairly large groups of files, otherwise a lot of manual editing is required. It is also impossible to encode dependency information that is based on the contents of the assets. So, for example, making a model file dependent on the textures it uses is impossible unless a human (or another tool) updates the dependency information by hand.
Dependency Information Stored in Assets
Another approach is to store the dependencies of asset files in the file itself. This way, the dependency information can be built by the exporter or tool that creates the file, based on the information it has about the contents. This makes this approach very suitable for handling assets such as models which may be formed from several separate files. It is also generally quite straightforward to implement, although a unified format for storing this information (either as part of the asset file, or in a separate metadata file) is required.
The main disadvantage of this approach is that it is only suitable in circumstances where the dependency chain for an asset can be easily predicted ahead of the processing itself, and is not likely to change often. This is because the information is generally needed to form dependencies for files other than the one it is actually stored in. For example, storing a list of textures used in a model file does not actually define prerequisites for the model file itself, as it is a source asset and has none. Instead, this information is used to construct the prerequisites for the processed file(s) created from this model.
Dependency Information Generated by the Processing Code
The final approach is to generate the dependency information "on the fly" by using the processing code (or a subset of it) to read each asset and build the dependency tree. This approach has the major advantage that it can easily handle very complex interdependencies between assets based on their contents, and it is relatively easy to maintain, once the initial framework is in place. Also, by building the dependency information this way, changes in the structure can be easily implemented, without having to edit external files, re-export, or reprocess assets to update their stored data.
However, the process of building the dependency information can be quite slow, and must be repeated whenever an asset changes. It also means that the dependency information is not easily visible for debugging purposes, or editable in the event that a special case change is required.
Of course, there is no requirement that only one of these approaches is taken; it is not uncommon to use a combination, picking the most appropriate technique for different types of assets or processing requirements. Dependency information from a number of different sources can be easily integrated into a single dependency tree for processing, and it is even relatively straightforward to remove all of the dependency information for a given asset or assets and re-insert it if changes to the asset that affect its dependencies occur during processing.
Determing When Assets Have Changed
The procedure for actually determining when an asset has been modified depends largely on the structure of the asset management system in use. If a version control system of some description is employed, then it is simply a case of either comparing the version numbers of each asset in the database with the last processed copy, or just retrieving the list of modifications in every changelist since the last update was performed.
On a flat file system, it is slightly more difficult to detect changes, although there are some methods that work relatively well. The most commonly used system is simply to compare the "last modified" date on each file, and check if it is newer than the last version that was processed (or newer than the processed output file, in some systems). This is not particularly robust, though, as it can be easily confused by actions such as "rolling back" files (by copying a previous version over the top), or if a machine's internal clock is wrong! It does have the major advantage of being very fast, and requiring little or no external information about the files.
Another, more stable method is to take a checksum of the files each time they are processed, and compare that against the stored copy. If a strong checksum or hashing system (the MD5 algorithm is a popular choice for this) is used, then the possibility of a collision, where two different files generate the same checksum value, occurring is infinitesimally small. Therefore, the check is a very robust way to determine if a file has changed. However, using this system requires that the entire source asset be read and the checksum calculated every time it needs to be checked, a fairly slow procedure.
If the file formats of the files being used are all under the control of the pipeline developer, or separate metadata storage is available, then one way to avoid this problem is to store the checksum in the file itself, thereby requiring only a handful of bytes to be read and compared to check for updates. However, it is comparatively rare that it is possible to do this for all types of asset files.
Another common compromise is to use both techniques, employing a simple timestamp based test for day-to-day updates, but performing a full checksum comparison on an overnight or weekend basis. This way, any assets that become "stale" as a result of an invalid modification date will be caught and fixed the next time a complete update is performed.
A less widely-employed, but useful in some circumstances, approach is to delegate the task of checking asset versions to the specific tools that perform the processing (sometimes after first checking the timestamp or checksum as an "early out" test). This allows the tool to perform much more fine-grained checking on the file, and determine which sections, if any need updating. For example, in the case of a game where levels are stored as a single large map file, it may be desirable for the map building tool to determine which sections have been modified and only update dependent files related to those, rather than the entire map.
The Make Tool
Probably the single most commonly used dependency-based build tool is make, a utility originally developed for Unix systems but later ported to just about every modern operating system. Make is available in every Unix distribution, and there are many Windows ports available, including direct ports of the Unix versions, and variants that are supplied with most compilers. Make's original purpose was to assist with compiling source code, but it is built in a very generic manner, allowing the invocation of virtually any command line tool as part of the process of converting input files into output files.
As such a generic tool, make is not able to use dependency information from the asset files themselves, and instead relies on an external file, known as a descriptor file that specifies both the dependencies between inputs and outputs, and the processing steps that should be performed on them. Make uses file timestamps to determine if a file is up-to-date, by comparing the last modifications of the source and destination files for each operation.
Descriptor File Syntax
Make's descriptor files are stored in a text file (typically called "makefile"), which is comprised of a series of rules, each of which defines the information needed to build a specific output file. The syntax is very simple: the name of the output file is supplied first, followed by a colon, and then a list of the prerequisites for that file. For example:
textures.bin : texture1.tga texture2.tga
specifies that the textures.bin output file depends on the two .TGA files listed. Therefore, if any of those files have a timestamp newer than that of textures.bin (or it simply does not exist), it will be rebuilt. The commands to build the file are specified immediately after the rule, preceded with a tab character to distinguish them.
textures.bin : texture1.tga texture2.tga
packtextures textures.bin texture1.tga texture2.tga
In this case, the command is simply executed "as is," and specifies directly the files to be operated on. However, make supports macros that can be used to allow rules to operate more easily on lists of files - for example:
TEXTURES = texture1.tga texture2.tga
textures.bin : $(TEXTURES)
packtextures textures.bin $(TEXTURES)
In this example, the list of input textures is defined as a macro, which is then subsequently referenced where it is needed rather than supplying the items explicitly. Macros are defined by supplying a macro name, followed by either "=" or ":=" and then the contents. If a variable is defined with "=", then it is a recursively expanded variable. Any reference to other variables will be kept intact in the macro and expanded every time it is used. If, on the other hand, ":=" is used, then it is a simply expanded variable. In this case, references to other variables will be expanded at the time the variable is defined, and the results stored instead. For example:
CHARACTERTEX = body.tga face.tga
LEVELTEX = grass.tga bluesky.tga
THISLEVELTEX = $(CHARACTERTEX) $(LEVELTEX)
LEVEL1TEX := $(CHARACTERTEX) $(LEVELTEX)
LEVELTEX = earth.tga redsky.tga
At this point, LEVEL1TEX will contain "body.tga face.tga grass.tga bluesky.
tga," as it was expanded before the definition of LEVELTEX changed. However, if THISLEVELTEX is used instead, then it will be expanded using the current values
of CHARACTERTEX and LEVELTEX, yielding "body.tga face.tga earth.tga redsky.tga" instead.
As seen in the previous examples, to reference a macro, simply surround the list name in brackets and prefix it with a $ sign (in other words, "$("). There are also some built-in macros (as well as more defined from the host machines' environment, such as the path to installed compiler tools), and a class of macros known as "automatic variables." These are automatically set up every time a command is executed, and contain information such as the target filename and the list of modified dependencies. For a full list of these, see the make documentation.
Make also contains various functions that can be referenced in a similar way to variables, and that similarly insert their results into the rule. These can be used to perform many useful tasks such as string manipulation and wildcard expansion (again, a full list can be found in the make documentation).
The rules in the definition file describe how to actually build the files referenced, but they will not actually cause anything to happen unless make has a reason to build the file. This will only occur if it is either explicitly asked to (by the user typing "make texture.bin," for example), or the file appears as a prerequisite in another rule that it needs to build (which, in turn, must have either been explicitly specified or invoked from a third rule).
In order to provide a convenient way to specify "top-level" rules that build a number of files, make supports phony targets. A phony target is a file that does not actually exist (and will never be created), but is always considered to be out-of-date. This can be used to write a rule solely for the purpose of triggering other rules, for example:
.PHONY : alltextures
alltextures : textures.bin textures2.bin
The ".PHONY" declaration defines that the target alltextures should be considered phony; in fact, even without this the rule would still operate normally. However, if for some reason a file called alltextures happened to exist on the disc, and it was newer than the source files (textures.bin and texture2.bin), then the rule would be considered to be up-to-date and skipped. Marking it as phony simply ensure that this can never happen.
In this case, the phony rule tells make that when it is asked to build alltextures, it should build the textures.bin and textures2.bin targets (because they are specified as dependencies). This rule can then be invoked by issuing the command "make alltextures" from the command line, or as a dependency of another rule, for example, a rule that makes all of the resources for the game. In addition, asking make to build the special target "all" causes it to build all of the top-level targets in the file (that is, all targets that are not prerequisites of another target).
As make was originally designed for processing and compiling source code, the early versions of the tool required every input file to be explicitly specified somewhere in the input rules. This is generally fine for programs, as the number of source files is usually relatively small, and additions are infrequent. However, this is not generally the case with game assets, and therefore maintaining a file that must contain the name of every single asset in the game soon becomes very unwieldy.
Fortunately, later version of make introduced a feature known as pattern rules. Pattern rules are a form of implicit rule (that is, a rule that operates on an entire class of targets, rather than an explicitly specified list) that allow a rule to be defined that is executed on every target whose name matches a specified string pattern. This way, rules that operate on specific types of assets can be easily built. Pattern rules follow exactly the same syntax as normal rules, except that the % character is used to indicate "one or more arbitrary characters" in the names specified. For example:
%.tex : %.tga
converttexture [email protected] $<
This rule enables any target with a .tex extension to be built from a corresponding .tga file. The [email protected] and $< entries are automatic variables that correspond to the name of the target file and the source file for the rule, respectively. For example, if the target file grass.tex exists, then this command would expand to "converttexture grass.tga grass.tex."
It should be noted that, like all other make rules, this does not actually perform any actions unless another rule references a file matching it.
Therefore, what is needed as the logical companion to pattern rules is some mechanism for specifying groups of files as the prerequisites of a target without actually listing them. This can be very easily achieved through the use of wildcards, for example:
alltextures : *.tex
This rule causes all of the files with a .tex extension in the current directory to be built (using whatever rules are available to do so, such as the pattern rule given above) when the alltextures target is referenced. However, this rule will only update existing files so that if a corresponding .tex file for the asset does not exist, it will not be built. Note also that while wildcards will be automatically expanded if they appear in a target or dependency list, in a variable declaration they must be explicitly expanded by wrapping them in the built-in wildcard function, "$(wildcard $.tex)," for example.
This is where the string manipulation features of make come in handy. Since what is actually required is not a list of the output files that exist, but a list of the output files that should exist, we can build that list by taking the list of input files and changing the extensions; we know that in this case, every .tga file should generate a corresponding .tex file. This can be done with the following rule:
TEXTURELIST := $(patsubst %.tga,%.tex,$(wildcard $.tga))
alltextures : $(TEXTURELIST)
This rule uses the patsubst function, which performs a pattern substitution. The first argument is the pattern to match (with, as before, % indicating any sequence of one or more characters), the second is the replacement pattern, and the third argument is the input data, in this case, taken from the list of .tga files generated by the wildcard function.
This pattern substitution has the effect of creating a list of the target files, by stripping the .tga extension and replacing it with .tex. The creation of this list is placed in a variable definition to improve performance. Since the variable is defined as being simply expanded, the wildcard and pattern substitution operators are only evaluated once, and then the resulting list is stored for re-use.
With this, it is possible to build make files that take source assets of different types from various locations, and build them as required without having to provide explicit rules for every single asset. However, there are often cases where it is desirableto be able to do just that, for example, when there is one texture that looks poor under the default compression settings, or if a special-case is needed to handle the player's character model differently from other NPCs.
Fortunately, make provides a very convenient mechanism for doing this. When searching for a rule to build a specific target, make will always use a rule that explicitly names that target if one is available, only examining implicit and pattern rules if none is found. Thus, even though a rule exists that specifies how to build .tga files into .tex files, if another rule is written with a target of player.tex, it will be used to build that file rather than the more general rule. If the same target is explicitly specified by two rules, make will generate an error.
Advantages and Limitations of Make
Make is a very powerful tool, and the description given here only covers a relatively small fraction of the available functionality. Make is very widely used, and has been tested on many large scale projects. There is even (albeit somewhat primitive) functionality included for running multiple tasks in parallel to improve performance on multi-CPU machines. The descriptor file syntax is somewhat arcane at first sight, but is quite readable and can be easily read and edited by both humans and other applications. In particular, it can be very useful to use external tools to generate portions of these, as a mechanism for encoding dependency information from asset files.
Make works very well on Unix systems, where just about any conceivable task can be achieved through shell scripts or other command line tools. However, on Windows systems, less basic functionality is available to command line programs. In practice, though, this is a relatively minor hurdle. Most of the important tools for asset processing can be command line driven (or must be written in-house), and the other "glue" utilities can be fairly simply replaced or rewritten.
The main disadvantages of using make are that it only checks for file modification through the file timestamps, and adding support for more complex dependencies (such as those based on asset contents) can be complex, and require many custom tools to build additional dependency information in a format make can understand. Also, make has no native support for integrating with asset management systems, it works strictly from a local filing system. Therefore, for most purposes some form of external program will be needed to handle the task of getting asset updates from the database and invoking make when required to perform the processing tasks.
Building Robust Tools
One of the key requirements of any asset processing system is that it must be robust under as many conditions as possible. Various mechanisms for dealing with broken source assets were discussed previously, but little mention was made of the steps the tools themselves can take to make sure that they fail as infrequently as possible, and that failures are handled sensibly.
Be Lenient in What You Accept, but Strict in What You Output
When writing any system that must interoperate with others outside your control, this is a good mantra to adopt. While your internal file formats will only be seen by a small number of programs, quite likely all written by one person or based on the same source code and libraries, when handling files created by or for the use of external applications, it is necessary to allow for a wide variation in the interpretations of the format specifications.
Most common file formats have been reasonably well documented, but even the best documentation still leaves vague areas or places where the precise behavior is deliberately left undefined for some reason. In some cases there are several sets of (often conflicting) documentation, or even worse, none at all. In these cases, a useful addendum to the above is "expect the unexpected." If it's at all possible within the basic structure of the format, chances are someone will have done it.
In recent years, many specification documents have adopted an "RFC-like" ("Request For Comments" documents are a set of publicly available technical notes, mostly defining protocols and standards for Internet use) style when describing the behavior expected from applications. Many RFC notes use a common set of strict definitions of the words "must," "should," and "may" to avoid any possibility of misunderstandings. These definitions are as follows:
MUST: indicates something that is an absolute requirement. For example, "the index field MUST be an unsigned 32-bit integer."
SHOULD: indicates that there may be valid reasons that this requirement can be ignored, but applications should not do this without first considering the consequences of doing so. For example "the header SHOULD include the name of the source file."
MAY: indicates that this requirement is optional, and it is up to the application to decide if it should implement it or not. For example "this block MAY be followed by one containing additional metadata."
From the perspective of an application reading a file that has been specified in such a manner, it is "safe" to assume that any compliant application will have implemented any "MUST" requirement, but the possibility that "SHOULD" or "MAY" requirements have not been met must be taken into account. However, as the description states, when writing to a file, unless there is a very good reason to do otherwise, "SHOULD" requirements should be met. This ensures the maximum possibility that another application (which potentially may have ignored these rules) can read the file correctly.
Regardless of these specifications, it is good practice to perform sanity checks on values read in from any file if there is the potential for them to cause significant harm (for example, indices that are used to reference arrays or the sizes of structures. In particular, one point worthy of special mention is that many formats do not explicitly state if values are signed or unsigned (and even when they do, this is often ignored). This can lead to serious problems if a negative value is inserted, as it will appear to be a very large positive number when read in an unsigned fashion (and, indeed, vice versa). Performing bounds checking on input values can help catch these problems quickly.
Handling Tool Failure
In addition to performing input validation, some mechanism for reporting failure in tools is also required, to handle situations where the data is sufficiently broken that recovery is impossible. Ideally, this should allow the system to take a suitable response to the broken data by rolling back to a previous version of the source asset file and retrying the processing step.
When reporting an error, it is generally best for the tool to supply as much descriptive information as possible about the problem. This can then be logged by the system, and used to diagnose the fault. If an e-mail server is available, then e-mailing this information to the pipeline or tool's maintainer is often a good idea. This way, they can often immediately see what the cause of the problem is, without having to actually track down the offending file's log entry.
The error report should also include, in machine-readable form, the names of the input files that caused the problem, and the state of any output files that were altered by the processing. This will enable the pipeline to perform the necessary recovery actions, removing or replacing the (potentially) corrupted output, and finding an alternative set of input files if they exist. It is particularly important that this is done whenever possible on more complex processing operations, as they may involve a large number of files, and in the absence of this information the pipeline may have to assume that all of the input or output files are potentially invalid.
When actually logging the error, this information can be added to by the pipeline; the most useful additional information is that which allows the problem to be recreated. In general, this will include the names and versions of all of the input files for the tool (and the tool itself), the command line it was executed with, and any other relevant system information such as environment variable values, available memory, and so on. With this, when a problem occurs there is a good chance that the fault can easily be recreated in a controlled environment (such as under a debugger).
In circumstances where tools are frequently being called with ephemeral intermediate input files, it can even be useful to store a copy of all the input data for the task that caused the error along with the report. This way, it is not necessary to perform all of the preceding processing steps to recreate the problem, and if the failure was due to a fault in an intermediate tool the broken data will be available to examine.
Another technique that can make debugging asset problems much simpler is if all the output from the tools is archived in a consistent location, for example, in a database or directory structure that mirrors the layout of the files in the pipeline. This way, for any given intermediate or output file (even if the pipeline detected no errors), the debug output can be quickly located. This can be very useful when trying to diagnose problems the pipeline has missed, such as "why is this model ten times smaller than it should be?" With some care, it can also be used to gather detailed statistics about various parts of the pipeline, such as the average performance of the triangle stripification or the distribution of mesh compression schemes.
This output redirection can be done on an individual tool level, but it is generally more useful to implement it as part of the overall pipeline functionality. This can be easily done by redirecting the standard I/O streams, and, if necessary, hooking the debug output functions (OutputDebugString() on Windows systems). Handling the redirection in this high-level manner both reduces the amount of code required in each tool, and provides redirection for third-party utilities or other similar "black box" components.
Regardless of the amount of protection in place, however, there will always be cases where either as the result of an invalid file or simply due to a bug, one of the processing tools crashes completely. These situations can be very difficult to deal with, because in the absence of detailed debugging information, the only solution is to run the offending tool under a debugger to find out where the crash occurred. This is quite time consuming, and in mature toolsets often leads not to an actual bug, but rather to an unexpected set of conditions in the input data. Having as much debugging information as possible can help considerably here.
One particularly nasty class of fatal error is that where a tool enters an infinite loop. This is quite hard to detect, as no actual error occurs, but instead the processing never completes.
The basic mechanism for detecting infinite loops is to implement a timeout, whereby the tool is forcibly terminated if the processing takes more than a specified amount of time. However, this time can vary wildly among different tools. For example, if a texture resizing operation takes more than a few minutes it has almost certainly crashed, but calculating lighting information for a large level may easily take a few hours to normally complete. Therefore, some amount of manual tweaking of timeouts will usually be necessary to avoid terminating tools prematurely.
Another mechanism that can be used to assist in detecting infinite loops is to allow the tool to expose a "progress meter" to the pipeline. Essentially, this is just a value between 0 and 100 (or any other arbitrary value) that indicates how far through the processing the tool has got. The pipeline can then implement a timeout that triggers if no progress has been made for a certain period of time, or if the progress meter goes backwards (a fairly sure sign of a bug!). This approach is more efficient, both because less manual tweaking of timeouts is required, and it is capable of detecting crashes that occur early in long processing operations without having to wait until the full time limit expires.
A progress meter can also be a useful tool for other purposes, such as judging how long a process is likely to take, and as a means of preventing human induced crashes when someone decides that a process has taken "too long" and kills it manually!
Debugging the Pipeline
If detailed logs are kept of tool execution, then many problems can be diagnosed simply by examining these, especially if the failure was caused by an assert() statement or some other situation that the code was able to catch and respond to. However, there will always be cases where a crash in a tool must be debugged "directly," by examining the code executed up to the point of the failure.
Regardless of the circumstances, whenever a crash in a tool is detected, if at all possible, either the tool itself or the pipeline should attempt to write out a stack track, register list, and possibly a memory dump; on most operating systems there are relatively straightforward functions provided for doing these. This can be absolutely invaluable in debugging hard to recreate problems, because all the information that can be obtained from viewing the crash in a normal debugger can be gleaned (with a greater or lesser degree of effort) from the dump information. Some debuggers even allow crash dumps to be loaded and viewed directly as though the crash had occurred locally, making the process even more efficient.
If a crash dump is not available, or the problem cannot be diagnosed from it, then it will be necessary to recreate the circumstances that led to the failure. This is where the detailed execution environment information the pipeline should report in the log file comes in useful. By retrieving the versions of the input files specified, and re-running the tool with the same command line and options, it should be possible to cause the crash to happen again. This is essential both for diagnosing the problem and then verifying that it has indeed been fixed.
As with any complex system, any asset pipeline will always exhibit a few bugs that cannot be reproduced in a controlled environment, or may even disappear when the pipeline runs exactly the same processing operation a second time! These are often due to the precise timing between events (this is particularly an issue if multiple processing tasks are being executed simultaneously), the layout of memory at the time of the failure, or simply hardware or OS faults.
As recreating them is nearly impossible, debugging such problems is almost always possible only with detailed logs and crash dump information. Even worse, it can be very hard to prove that such a bug has been fixed: sometimes changing another unrelated section of the code can cause it to disappear, simply because the sequence of events that revealed the problem now occurs more rarely.
There is little that can be done to mitigate these problems, except for ensuring that all of the tool code is as robust as possible, and that the maximum amount of available information is gathered when a crash does happen. In the worst case scenario, it may be necessary to run the entire pipeline in debug mode or under a debugger to find the problem. Although, it is worth noting that in some rare cases this additional instrumentation can prevent the fault from occurring!
Maintaining Data Integrity
Aside from producing as much information as possible to help locate the problem, the other main task of the pipeline when a crash occurs is to recover as safely as possible, and continue in as normal a manner as possible. A critical part of this process is ensuring that any data files that were modified by the tool that crashed are safely removed or reverted to known good versions. Otherwise a single error can cause a cascade of failures as each successive step in the pipeline tries to use the corrupt data output by the first tool!
This "clean up" process mainly involves removing any temporary files that were created and deleting or invalidating output data that may be truncated or corrupt. Having a dirty flag in the file headers can be a big help here, as it allows partially written files to be easily detected. If checksums of files are being stored for the purposes of detecting changes, then these, too, can be used to detect modifications.
Modified intermediate files can either be deleted entirely and then recreated by re-running the tool with the previous set of input data, or by replacing them with the last versions directly (assuming that these are stored somewhere). Either approach works well, although the latter is generally preferable where possible as it reduces the amount of time needed for the recovery operation.
Another possible approach to take to ensure the integrity of data in the pipeline is to "sandbox" each tool's execution. In this case, the files the tool may modify are copied prior to its execution, and the tool operates on those copies. Only once the task has been successfully completed do the original files get overwritten with the updated versions.
This approach makes sure that an errant tool cannot corrupt files when it fails (clearly, no such guarantee can be made if the tool claims to have executed successfully), and for further safety all of the tool's input and output files can be moved to another directory before processing, thereby ensuring that no files other than those specified as outputs can be accidentally modified. In this case, the truly paranoid can even take the step of making the rest of the pipeline data unwritable to the tools if desired. Sandboxing the execution in this manner is a very effective safeguard, but it does introduce additional overheads in the execution of each processing step.
Dependency analysis plays a vital role in improving the efficiency of asset processing operations, by ensuring that only the files directly affected by each change to the source assets are updated. As this is a common problem, particular in source code compilation, there are many existing tools available that perform this task well - the make utility being the most popular.
Another important prerequisite for building an effective asset pipeline is a strong framework for tools, and well-defined file formats for interchange of information. The effort expended on getting these aspects of the system right is well worth it, as they will have an effect on virtually every stage of the process. Wherever possible, common functionality should be integrated into this framework, speeding the development and improving the robustness of every tool based on it. Isolating tools from each other as much as possible is also a useful technique for ensuring that failures in one section of the pipeline do not affect others.
This article is excerpted from The Game Asset Pipeline. (ISBN # 1-58450-342-4). For more information about the book, please visit http://www.charlesriver.com/Books/BookDetail.aspx?productID=88993.