This is the second in a three-part series of articles detailing how we designed and deployed usability testing for our latest iOS game, A Clockwork Brain.
[This article was originally published on the Total Eclipse blog]
The research, design, and deployment of usability testing took one month from start to finish. Prior to this, none of us had any experience with designing formal usability testing. I had some experience in questionnaire design and in facilitating experiments from previous university research work.
The first article explained our choice of hardware and software and detailed the set-up costs. This article examines the game itself and explains its usability testing procedure.
The following topics will be discussed:
- Knowing your game.
- What kind of players we wanted to invite and how we recruited them.
- Discovering what to test.
- Designing the first of the two usability scenarios.
- Using the iGEQ questionnaire and open-ended questions during testing.
Know your game
We had no idea how to proceed in the beginning. After reading articles for days, we discovered:
- A ton of information on web usability that could be helpful.
- A lot of academic articles on games usability. Most of them were too theoretical, with no immediate practical application in our case. Some exceptions did exist, which are referenced below.
- A handful of post-mortems with clear how-tos on games usability by games studios themselves (or contracted usability companies).
We found that answering the following key questions proved very helpful with the subsequent design.
How far is your game from completion?
We decided to test a couple of months before Beta, so most artwork and mechanics were completed by then. There is no reason why one should not do a usability test at a much earlier stage. In fact we think this would be very beneficial in resolving disputes between different designs. You need to know the stage of completion simply because what you have is what you can test.
With careful planning you could organise more than one usability session at different stages of your production.
Who are your players?
Knowing your prospective players means knowing whom you should invite to play-test your game, right? Well, that is what I read in most usability articles but I do not take it as gospel. I think you should also invite some anti-players if you want your game to have a very broad reach.
A Clockwork Brain is a series of puzzle games that target pretty much everyone. It is imperative for us that the controls can be quickly understood and that, even if someone does not read tutorials (many people don’t), she should quickly grasp what’s going on. Because it’s an iOS game, our target players are people who own such devices and are therefore likely to download our game. However, we also brought in people who had never used a touchscreen phone in their lives. We also invited people who we knew love hardcore games and dislike playing games on their phones.
The feedback you get from people who have no experience with your game’s genre, or with games in general, could be extremely helpful and a real eye-opener.
How many players should you invite?
Nielsen suggests that 5 users is the best number for a usability session. As you add more and more users, you learn less and less from each one. If you have 10-12 users, it is much better to split them into two consecutive usability sessions: do the first, fix stuff, do the second and test what you fixed. Others suggest 8-10 users per session. In any case, consider that you must have some extra users handy and in particular:
- At least one for each mock usability session
Before you unleash the usability design onto your worth-their-weight-in-gold players, you should test it. The mock session will verify that the timings you have estimated are correct, the procedure has no holes, and that everything runs smoothly. Obviously don’t use someone who has worked in designing the usability session. If your mock session goes according to plan and your user is also a prospective user of your game, you will have valuable feedback before you even begin. That’s what happened with our mock sessions – most of the feedback we got on that day we also got on the official runs.
- People to invite in case of a no-show
Someone might cancel – this should not affect your session. You should have a list of people you can invite in their place.
For our two usability sessions we had 5 players in the first, 6 in the second and two more for each mock run. We should stress here that 5 players is a good number only if you don’t want to differentiate between the players. For example, if you wanted to compare the behaviour of adults with the behaviour of children, you would need two groups for each session, one with children and one with adults. For this scenario, Nielsen suggests:
- 3-4 users from each category if testing two groups of users.
- 3 users from each category if testing three or more groups of users (you always want at least 3 users to ensure that you have covered the diversity of behaviour within the group).
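The diminishing returns behind the 5-user advice can be sketched with the formula Nielsen and Landauer proposed: the share of usability problems found by n users is roughly 1 - (1 - L)^n. The value L = 0.31 used below is the average per-user discovery rate that Nielsen reports for the projects he studied; it is an assumption here, not something measured for A Clockwork Brain, and the function name is made up for illustration.

```python
def share_of_problems_found(n_users: int, per_user_rate: float = 0.31) -> float:
    """Estimate the share of usability problems found by n_users testers.

    Uses the Nielsen-Landauer model 1 - (1 - L)^n, where L is the share of
    problems a single tester uncovers. L = 0.31 is an assumed average.
    """
    if n_users < 0:
        raise ValueError("n_users must not be negative")
    if not 0.0 <= per_user_rate <= 1.0:
        raise ValueError("per_user_rate must be between 0 and 1")
    return 1.0 - (1.0 - per_user_rate) ** n_users


if __name__ == "__main__":
    # Show how little each extra tester adds once the group grows.
    for n in (1, 3, 5, 8, 12):
        print(f"{n:>2} users: {share_of_problems_found(n):.0%} of problems")
```

Under these assumptions, 5 users already uncover about 84% of the problems, which is why two small sessions with a round of fixes in between tend to be more useful than one large session.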
How will you recruit your players?
It’s best to plan ahead and include questions in your recruitment survey that will also help you select players for future projects. You need to know typical things like age and gender, but also what access to technology a player has, how often he/she plays games and on which devices. This way you can also broaden your reach of possible testers; just 40% of those who completed our survey were located in Greece. Lastly, this gives you each player’s demographic info before he/she arrives at the session, so the player won’t spend precious time completing demographic questionnaires and will instead focus on what matters: the game.
By the way, make sure that you give the players an incentive by rewarding them. In our case, each usability participant was rewarded with two of our games plus two shiny game posters.
What do you want to test?
This is a very hard question to answer and requires serious thought and planning because it will define the structure of the usability session itself. For A Clockwork Brain, we found it helpful to describe the game as concisely as possible and start from there. You can read the description here. In summary, it identifies:
- A free-to-download brain-training game with a steampunk look and feel.
- Platform: iOS with in-app purchases.
- A Token system that rewards players as they play.
- Eleven mini-games testing various skills (such as logic and arithmetic): 4 free, 6 bought through in-app purchases and one unlocked with Tokens.
- One-tap control for every mini-game.
- The players have to give as many correct answers as possible within a minute. If they do well, the levels become harder. If they answer incorrectly, the tasks become easier.
- Two modes of play: Challenge (4 mini-games in a row) and Single Game (play any game you want). Both give Tokens as rewards and submit scores to local & global Leaderboards.
Just by doing this we realised that we wanted to test two different aspects of the game:
- The mini-games themselves. Does each mini-game mechanic work as planned? Are the individual instructions clear? Do players like, or dislike, the mini-game?
- How well have we implemented the ‘just downloaded the app’ experience? How does the game flow in those first crucial minutes?
These aspects both depend on first impressions, and they conflict with each other. A player playing for the first time would have no immediate access to all the games, or to the Single Game mode. On the other hand, a person who has played all the mini-games and has had experience with the game would not be a good candidate for testing the ‘first experience with the game’. We needed at least two differently designed usability sessions, with two different, fresh sets of players.
1st Session: Testing the Mini-Games
This session focuses on the usability and playability of each mini-game. The first thing we did was to set some tasks for each player:
- The player has to play each mini-game three times: This would ensure that players had (hopefully) understood the mechanics and would also allow us to see how they progress from total novices to just novices. For example, is there a game mechanic that they only seem to understand later, in the third playthrough, that we’d want them to understand immediately?
- The player has to complete a questionnaire for each mini-game: Immediately after playing the mini-game, the player should complete a questionnaire that evaluates their first impressions of the mini-game.
- We should have a closing discussion with the player: Once the session is over, have a discussion about their overall experience.
Each task/requirement had further implications on designing the session.
Because the games are based on cognitive skills, we thought they could be tiring for a player who is asked to play each of the 11 mini-games 3 times. We had to ensure that a player would not rate a mini-game negatively simply because he was mentally tired by the time he reached it. To balance this out, the order in which the games are played changes for each participant. We have:
- Four core games (free games, playable by all, high priority, good first impressions matter the most).
- Three packs of two games each (premium games, playable by those who purchase, medium priority).
- One bonus game that unlocks much later (activated with Tokens, low priority).
It should be stressed that during the session all games are presented as equal to the player. The categorisation of the games is not revealed, in case it might somehow affect the player’s perception. The player is simply asked to play the mini-games as they appear in Single Game mode.
In reality, each player is assigned a different play order. To quickly switch the game order in Single Game mode we implemented an in-game developer’s panel that allowed us to make changes like this rapidly between sessions. We also implemented a quick ‘save/export’ player data feature, which allowed us to quickly save each player’s detailed scores in a database file. Lastly, to make the experience easier for the player, we hid some menu options that were not required for this session, such as the Upgrades section and Challenge mode.
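One simple way to give every participant a different play order is to rotate a base list of games so that each participant starts at a different point. The Python sketch below only illustrates that idea; it is not the scheme used in the actual sessions, and the function name and game identifiers are placeholders.

```python
from typing import List


def rotated_play_orders(games: List[str], n_players: int) -> List[List[str]]:
    """Return one play order per player by rotating the base list.

    Starting points are spread evenly over the list, so no game is always
    played first (when players are fresh) or always last (when they are tired).
    Orders are all different as long as n_players <= len(games).
    """
    if n_players <= 0:
        raise ValueError("n_players must be positive")
    if not games:
        raise ValueError("games must not be empty")
    orders = []
    for player in range(n_players):
        # Evenly spaced starting index for this player.
        shift = (player * len(games)) // n_players % len(games)
        orders.append(games[shift:] + games[:shift])
    return orders


# Placeholder identifiers for the eleven mini-games (hypothetical names).
GAMES = [f"game_{i:02d}" for i in range(1, 12)]

if __name__ == "__main__":
    for number, order in enumerate(rotated_play_orders(GAMES, n_players=5), start=1):
        print(f"Player {number}: {', '.join(order)}")
```

A rotation keeps neighbouring games together, so it does not remove every order effect; shuffling within each priority group would be another option if that matters for your game.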
Designing the questionnaire
For obvious reasons, the questionnaire for each mini-game had to be short and easy for the player to complete. We wanted to know if the player was having fun, if she had enough of a challenge, whether she was immersed in the game, and if she liked the visuals. Lastly, we wanted to get information from the players on specific usability areas: Did they understand when the level changed? When they answered correctly, did the game make it clear enough? Was the feedback sufficient?
The questionnaire consists of two parts. The first part is a version of the iGEQ (the in-Game Experience Questionnaire by FUGA). The iGEQ is a shorter version of the 42-question-long Game Experience Questionnaire. Each question measures one of seven factors: competence, sensory and imaginative immersion, flow, tension, challenge, negative affect and positive affect.
We translated the GEQ into Greek as per FUGA’s instructions and then adapted the iGEQ to a mini-iGEQ. Instead of 14 questions, two per factor, we used 7, one for each factor. Research-wise, this is less reliable, as more questions for the same factor give greater strength to the results. However, in all documented uses of the iGEQ, the players rarely completed it more than 3 times. In our case, the player had to complete 11 of those, as we needed one for each mini-game. As a result, we chose the most representative question for each factor (in its Greek equivalent) and used those.
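To make the one-question-per-factor idea concrete, here is a small Python sketch of how mini-iGEQ answers for a single mini-game could be averaged per factor across players. It assumes each answer is scored from 0 (not at all) to 4 (extremely); the factor keys, the function name and the sample answers are invented for illustration and are not our data.

```python
from statistics import mean
from typing import Dict, List

# One question per factor, as in the mini-iGEQ (key names are hypothetical).
FACTORS = (
    "competence",
    "immersion",
    "flow",
    "tension",
    "challenge",
    "negative_affect",
    "positive_affect",
)


def mean_factor_scores(responses: List[Dict[str, int]]) -> Dict[str, float]:
    """Average each factor over all players' answers for one mini-game."""
    if not responses:
        raise ValueError("need at least one completed questionnaire")
    for answers in responses:
        for factor in FACTORS:
            score = answers[factor]  # raises KeyError if a question was skipped
            if not 0 <= score <= 4:
                raise ValueError(f"{factor} score {score} is outside 0-4")
    return {factor: mean(a[factor] for a in responses) for factor in FACTORS}


if __name__ == "__main__":
    # Made-up answers from two players for one mini-game.
    sample = [
        dict(zip(FACTORS, (3, 2, 2, 1, 3, 0, 4))),
        dict(zip(FACTORS, (2, 3, 2, 0, 2, 1, 3))),
    ]
    for factor, score in mean_factor_scores(sample).items():
        print(f"{factor}: {score}")
```

With only one question per factor and a handful of players, averages like these are best read as rough indicators to compare mini-games against each other, not as precise measurements.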
Here's the full GEQ questionnaire translated into Greek along with the questions chosen for iGEQ and mini-iGEQ. Additionally we provide a sample of the resulting questionnaire in English and Greek, for one of the mini-games.
A big part of the closing discussion was letting the players talk and express their feelings about the game. It was also an opportunity for us to ask questions on issues we observed during the session. We left that part pretty much unscripted, apart from having a few set questions such as whether they had any favourite/least favourite mini-games. In retrospect, not having a fixed list of questions to ask the player was not a good approach. Sometimes, questions were asked in a leading way. Additionally, one of our facilitators may have inadvertently influenced the discussion by referring to previous players’ performances in the game. For the second usability session we carefully prepared all the general questions we wanted to ask and had a printout with us during the discussion.
This marks the end of the second usability article; I hope you’ve found it interesting. In this article I have explained what our game is, how we decided which parts of it we wanted to test, how the players were recruited and how we designed the usability session focusing on the mini-games. The final part will examine the second usability session. We’ll also discuss the lessons we took away from the whole experience and, of course, present some results and design choices that we made due to the usability sessions.