[Experienced tool developer Cyril Marlin outlines a method used to build an automated testing system for Wizarbox's SoBlonde, a Wii adventure game -- which reduced bugs, increased polish, and is flexible enough to be adapted to other genres.]
A game solver can find a way to play an adventure game to the end, giving help to testers, as you can see in this video, testing SoBlonde for the Wii.
In game development, one task that is time consuming is testing. There are many documents and tools covering unit tests, however here we'll cover higher-level tests and their problems.
Testing is a highly repetitive task, and many use cases should be checked. Each modification has a risk of breaking the game: engine, high level logic, or resources are each a potential source of problems. Furthermore, the diversity of situations adds an additional problem, since each change should be tested in every situation. This can add up to a lot of time.
It is imperative to test as soon as possible, and as completely as possible, because the later a bug is detected, the bigger an effect it will have on the game. Indeed, once checked into the game's source control system, the bug will be start to be mixed with other developers' changes.
If so, when discovered, it will take longer to find and fix, as its resolution may conflict with the work of others. Finally, the larger teams needed for big projects increases the number of concurrent bugs in a project.
This can sometimes be solved by specific testers during the whole project, but the repeatability of the task has serious implications on the team. Bad habits can slip in, such as only running a single path to complete the game, and/or skipping some parts. In these situations, testing may not find bugs that exist, but the cost of testing remains the same. Worse, testers can have a hard time paying attention, thanks to the fact that the task is too simple or repetitive.
However, as I have already said, continuous tests are necessary, because without them, the source control system is likely to become corrupted and impact other developers.
Sometimes we can use a hard-coded test script, which contains all the actions to perform. But this method is far from ideal, as it will often have problems:
- Linear testing. Actions are always performed in the same order.
- Cross quests. A test script can hardly take account of variations, or optional side quests.
- Up to date. If developers "forget" to report modifications of the adventure, the script may fail.
- Dependent upon development completion. It is difficult to test until the complete adventure is implemented.
- Logic. A change to a quest in one chapter may have consequences in another part of the game.
As a solution, we automated some aspects of game testing, including:
- Integration testing on the entire game (load the full data set).
- Dynamic tests: the test will automatically adapt to changes during game development.
The objective was to save time during development by removing the costs of creating and maintaining test scripts. Additionally, because testing has been performed throughout development, more potential bugs will have been found prior to entering QA.
The Script Engine
The creation of a solver is closely linked to the data representation and logic operation of the game engine. We will describe below the scripting language that was developed to meet the production needs of the game SoBlonde: Back to the Island. It is a very simple script language, based on a set of C macros: The script associates, for each scene event, a list of actions to perform, with an
- ID (among Activate, Look, Talk, Use, Touch/Untouch, Combine and SceneInitialisation)
- The concerned entity.
And for every action in the list:
- ID (Move, Rotate, Play Animation, Set Visible or not, Add/Drop Item, Set Scene, Set Var...)
- The concerned entity.
The script system listens for events generated by the user or the game, and executes the associated action list. We can conditionally run of a sub-list of actions with conditions based on scripting variables declared globally. The fact that there is a global registry of variables is of importance, as we will see later. Here is an example:
// shell has not been captured and bernie is on left
PlayAnimation(ID_ACTOR_SUNNY, ANIMATION_SUNNY_SHRUG, false, false);
PlayAnimation(ID_ACTOR_SUNNY, ANIMATION_SUNNY_IDLE, true, false);
Parallel execution (as seen in the game) is out of our scope. We associate the "Activate" event on the entity BERNIE's to actions with condition BG01_PICK_BERNIE equal to 1.
The scripts are integrated in the game as modules: one for each scene (a scrolling screen) for easier editing. Being written in C, they are always available in memory. The entities, however, are not saved; their parameters (position, visible, interactive, etc.) are defined in the scene initialization stage. All variables are globals, so a game state is simply the script variables plus the content of the inventory.
The end result is very simple, and allowed rapid development of the game.
The solver is implemented as a breadth-first tree search. It will explore the states of the game, until we met our goal of approaching the end of a quest. These goals will be extracted from the quest book, since the finishing point of a quest contains the associated variables and matching values. States represent nodes in the tree. Events are branches that can be followed, to reach a new state.
It will therefore list the events of a scene, taking one and running it to build a new state of the game. This process will be repeated until a goal is reached or a search depth becomes too great. When a goal is reached, the tree is followed in reverse order to get an efficient way to reach it.
The search is done by brute force, so it is necessary to lighten up its execution. First, we need to change the architecture of the game to isolate the logic: we split the operations of the script from the rest of the game to allow their execution without loading the corresponding scene resources or the corresponding entities.
We load all the logic information of the game at one time (global information and information of each scene: the number and identification of entities and their properties) at the beginning of the solving. We create an object that gathers information on the state of the game, GameState. We can then proceed to the main loop:
1. Test end of recursion: goal achieved, depth reached;
2. Test special cases: Scene initialization;
3. Test special cases: Check combination of inventory objects;
4. List the events of the scene;
5. For each event:
- Test event parameters (items available in inventory);
- Execute the event;
- Recursively call solver with new state (going to step 1.).
This approach quickly showed its limits and many problems arose.
The first problem was the state of entities.
We assumed that the script can interact with any entity in the scene, but this is not always the case. An entity may be invisible or non-interactive, and this can change during the course of the game. The entities states are not present in the game state. Indeed, when we load a game state, the entire scene is reset, as if the player had just entered. Consequently, before each event's evaluation, we must do scene initialization. However, this solution also exhibits several problems:
- If the script is bugged, scene initialization may fail and the solver may get stuck in a loop (this is a behavior we can wish for under some circumstances -- but it is not our goal in this situation).
- Initializing the scene systematically associates an introduction sequence that must be executed, so the solver will spend time on useless actions.
The solution was to add the entities' state information in the game state, in compact form, to avoid doing again scene initialization after each step.
Another batch of problems came from the fact that the solver was quickly lost in useless calculations. After a quick profiling, we identified several simple optimizations:
State loop cutting. The brute force approach will fail very quickly if we waste time. We are looping on previously explored scenes during the search; we return to a scene already visited, even without game modifications. Our solution: we use a table containing a fingerprint (hash) of the game states already explored. This worked very well. Hash was faster than a full memory compare. We stop the search if we match a previously explored state.
No side effects cutting. We stop the search if an event had no side effects. This way we avoid searches done in other branches. In the same way, but this time with static information, we can check if an event's actions have a possible side effect under any conditions. If not, we filter the events that have no side effects.
Solving dialogue. In our game, dialogue is separated from the script system; it is handled by a different subsystem. However, they have the ability to check and change the game variables, hence changing the game state. Dialogues can also be seen as a tree: questions (propositions) are branches that may lead to other branches or sentences (or leaves). Some propositions are available only under some conditions. Variables can also be modified when a line of dialogue is shown.
The solver quickly finds itself stuck in the dialogues. One solution is to set up a second solver which will work with the same game data, but produce specific actions in the associated context. This solution is attractive, but necessitates some heavy work. Furthermore, only few cases were a problem for the game solver.
We used a simpler solution: the dialogue is completely covered, sequentially. All propositions will be selected, one by one. In our game, the exit of the dialog was easily found, as it was always the last proposition. This approach worked, except for the very special case of puzzles in dialogues.
Some puzzles are solved in a dialogue, but are not very common. For example, we had a story split into parts, and we needed to select the right proposition in the right order. We also had dialogues that triggered re-initialization of the scene, leading to an endless loop. We have solved these few cases by hard-coding them. This may not work in every case (the solution must be static; using a dialogue for scene navigation will fail here), but it proved simple and reliable in our case.
Fedex quests. In some cases, we must walk through several scenes to complete an action, repeatedly. This quickly overflowed our search depth limit. This is a problem that may also affect the player, and we have solved it by using the same trick as used in the gameplay. We used the custom object "map" that is obtained in the third chapter of the game. We add the ability to shorten displacements using map's hotspots. This solution is interesting, as we show that the CPU's solving limits can be the player's limits, too.
Goals that are too far away from the current point. Sometimes it takes many actions before you can open or close a quest. The quest book gives us some goals far away from our game state. To solve this, we added the content of the inventory as mid-objectives; each inventory change becomes a goal accomplished. Since we are a point 'n' click game, gathered items are meaningful.
Inventory loop and quest loop. This problem is related to the state loop, in that unfortunately, some items can be consumed, and obtained again with very little action. Plus, some items are stackables -- up to 99 of an item.
We first tried to solve this by only using dropped items as goals. However, this will shortcut the cases where we must drop an object to continue the adventure. The solver will add it and remove it instantly (as a new short goal) in a loop.
As a solution, since only a few items of the inventory can lead to this problem, we chose to blacklist theses. In the same fashion, some quests can be reopened after being closed. A similar blacklist solution successfully removed these cases.
Another idea is to keep a full solving history through fingerprints, and stop the search if a previously explored game state is met. We didn't try it, as blacklisting proved sufficient. However, this may be an interesting development, avoiding hard coding.
Finally, here are some numbers from the game:
- Backup (game state) size: approximately 2kb
- Number of scenes 70
- Number of characters 75
- Number of Boolean variables 1272
- Number of integer variables 49
- Branching factor: about 3.5
- Maximum depth search: 17 (may be quite long on a Nintendo Wii)
- Test time: about 30 minutes to complete the adventure (with game sped up 4x) on a Nintendo Wii.
As we saw, the execution of a game solver requires the processing of game logic data. This leads to load multiple levels / scenes together (if not all), so there may be some significant changes in the game engine. Furthermore, this technique is depends of the use of the quest dictionary to search.
For another type of game, another provider for goals has to be found. This method allowed a continuous logic test of the game. This gave us a very high value at a reasonable cost. We tested continuously from early in the development phase to final test phase. The final test phase has far less bugs from this method, and reduced the time to polish.
As an extension, we added the ability to dynamically create variants of the adventure by adding random events. The solver is able to resume execution from any state of the game. A wide variety of solutions, including testing non-linear level design is now possible at low cost! It makes better use of testers: they are no longer blocked due to a crash on the critical path. So we can spare time on bug fixing.
This method can be combined with other methods like those seen in "Mario AI Competition" by Sergey Karakovskiy Togelius and Julian, to solve skill-related gameplay. A pathfinder could also be useful if some events are locked by spatial access. Many action or adventure games could use such tools to do automatic test.
To conclude, the good performance of this solver gives the possibility to use it to help the player in his quest if blocked, similar to the "super guide" that can be seen in New Super Mario Bros. Wii. The advantage is to use the player context: in an RPG, the solver will suggest different solutions if the hero is a barbarian or a magician.
Another possibility is full tree exploration. This way we can certify that no sequence of actions can lead to blocking the main quest. We can also collect some metrics, like depth from quest solving, identifying hard quests to solve.
Thanks to Wizarbox, Sébastien Duperron, Aurélien Pocheville, and Claire Roger.