Playtesting Challenges and Strategies from SMU Guildhall's First VR Game, Mouse Playhouse

VR poses unique challenges for UX researchers on a small dev team. While working on Mouse Playhouse at The Guildhall as a usability team of one, I developed strategies to make testing more efficient and effective.

My Experience

From September to December 2016, I worked as "Usability Producer" for Tinkertainment, an 18-person capstone project team at The Guildhall. My primary collaborators were:

  • Producer: Mario Rodriguez
  • Game Designer: Clay Howell
  • Lead Level Designer: Michael Feffer
  • Executive Producers & Faculty: Mark Nausha & Steve Stringer

I was responsible for:

  • planning user research surveys and playtests
  • designing experiment protocols / scripts
  • recruiting and scheduling participants
  • conducting test sessions
  • identifying top UX concerns
  • analyzing survey and telemetry data for trends
  • visualizing data
  • drafting reports and presenting findings to the team
  • advocating for the user during design meetings


As SMU Guildhall's first VR game, Mouse Playhouse charted new territory for all of the developers involved. We built the game in Unreal Engine 4 for the HTC Vive. Mouse Playhouse is a light-hearted puzzle game in which players guide pet mice to cheese by placing blocks that redirect their movement. Players can also relax in a toy area where they can throw objects, shoot baskets, toss darts, and explore.

In our playtests, we wanted to capture observations about:

  • Success of onboarding & tutorials
  • Difficulty progression of puzzles
    • perceived difficulty scores
    • time to complete
    • deaths
    • actions
    • teleportations
  • Balance of the number of moves needed to achieve a "star" on each level
  • Conveyance of narrative
  • Comfort of the Chaperone bounds warning
  • Comfort with height of the puzzle table
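To make observations like these comparable across testers, it helps to record them in one consistent shape per player and per level. Here is a minimal Python sketch of that idea; the field names are illustrative, not our actual telemetry schema:

```python
from dataclasses import dataclass, asdict

@dataclass
class LevelMetrics:
    """Per-level playtest metrics for one participant (illustrative fields)."""
    level: int
    perceived_difficulty: int   # 1-5 survey score collected after the session
    time_to_complete: float     # seconds
    deaths: int
    actions: int                # block placements, grabs, etc.
    teleportations: int

# One record per level played makes it easy to chart difficulty progression.
session = [
    LevelMetrics(level=1, perceived_difficulty=2, time_to_complete=95.0,
                 deaths=0, actions=14, teleportations=6),
    LevelMetrics(level=2, perceived_difficulty=4, time_to_complete=240.5,
                 deaths=3, actions=31, teleportations=11),
]

# Flat dictionaries map directly onto one spreadsheet row per level.
rows = [asdict(m) for m in session]
print(rows[1]["deaths"])  # 3
```

Keeping every record flat and uniformly named is what later lets one analysis script handle every sprint's data.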

We used a combination of the following methodologies:

  • Screening / Demographic Surveys to assess VR experience before the test
  • 30-minute playtesting sessions for onboarding users to the VR control scheme
  • 90-minute playtesting sessions to test the full level progression
  • Gameplay and Marketing Surveys to gauge reactions and difficulty scores
  • Think-aloud Protocol (talk as you go)
  • Heuristic Evaluation
  • Game Telemetry through C++-generated spreadsheets
  • Interviews & Focus Groups

These methodologies are standard in the game industry, and the feedback they generated was key to making our game fun and comfortable within our short development timeframe. Nonetheless, I learned that user research for VR games poses its own unique challenges.

VR Challenges for User Research

Limitations on group size

Four considerations limited the number of people we could playtest at one time:

  • Room-scale VR (HTC Vive) systems are relatively expensive, so we could only afford 5 Vive stations total for development.
  • Room-scale play space is 1.5m x 1.5m at a minimum, which meant we could only fit one Vive in our normal 12 ft by 12 ft lab space.
  • User research often happened concurrently with development, so testing more than one person at a time required me to pull developers off a Vive. I often had to schedule larger playtests with the producer after major milestones.
  • As a research team of one, I can only monitor one Vive user at a time, and even then, my view (from the PC monitor) is limited compared to what the user sees in the headset. In a traditional UX lab, I can monitor a row of 3 or 4 computers at once.

Small expert user population

Throughout development, I felt a tension between having easy access to novice VR users and needing expert VR users. It was often challenging to find expert VR users in our convenient testing pool of students, friends, and family. In contrast, we had an overabundance of novice users, some of whom fell outside our target demographic. Novice users are great for testing comfort and tutorials, but expert users were needed to assess the full level progression of easy, medium, and hard puzzles.

VR Sensitivity

Simply put, every user is different, so every user has a different tolerance for VR. Some people are more prone to motion sickness, dizziness, or claustrophobia than others. This variability is handy when hunting for frame-rate drops and comfort issues; however, it can disrupt a full-length session and cost you a much-needed data point if the player has to quit the experience early. This sometimes happens with intense PC games, but it is far more prevalent in VR.

Novelty Skews Gameplay Reaction

VR is a mind-blowing, immersive medium that many people react strongly to on their first exposure. It is easy to say "that was awesome" even if a game had inconsistent visual language, bad graphics, or boring gameplay. Experienced users, by contrast, are quicker to critique games and find problems because they are used to the serotonin-boosting effects of VR. As I mentioned above, I had an abundance of novice users and a shortage of expert users. It was hard to know whether players truly loved our game content or were just reacting to their first time in the Vive.

Unique Mechanics & Lack of Standards

New players must learn an entire input system before they can playtest your game, and this takes time. Expert Vive users tend to expect some kind of teleport or navigation control along with some kind of grab control, but many Vive games introduce new interactions that players must discover on their own. In Mouse Playhouse, for example, levels included a radio with buttons that localized the text for Spanish- and French-speaking users. A new player, however, may base their expectations on mental models from more pervasive technologies -- smartphones and PCs. As a researcher, I had to add buffer time to my experiments to walk players through Valve's Vive tutorial so each tester had a clear baseline of controls. VR games must teach users the "WASD" or "A, B, X, Y" of an entirely new game system.

Lack of consistency breaks immersion

I had one user who assumed that no toys in the toy room were interactive because a train didn't do anything when he played with it. That was one toy in a room of 20+ interactive objects. Visual language, rules, physics, and control mappings -- any schema that players build up -- can deteriorate in an instant if something breaks that consistency. Researchers may need to nudge players when they make a faulty assumption based on a bug or an incomplete feature.

User Fatigue

Although time does pass more quickly in VR, users tend to reach a limit after a certain threshold. It is fatiguing to stand for hours at a stretch while waving your arms around, with a high-resolution screen fastened to your face. VR fatigue typically sets in faster than PC fatigue.

Observation & Data Collection

As I hinted earlier, I can only see a user's computer monitor, not their actual Vive view. Furthermore, I cannot easily watch users' hands or controller usage while watching the screen. This differs from PC game testing, where the player sits right next to the viewing terminal; everything is spread out in room-scale VR testing. The problem is amplified when you have only one researcher.


Strategies

  • Get a dedicated testing station if possible. You will be interrupted frequently if you are limited to using a developer station.
  • Use standard tutorials to normalize testers. If you are testing the full content of the game rather than just the tutorials, it is reasonable to give players baseline knowledge of the platform before they play.
  • Give players a disclaimer about possible dizziness and offer an escape path. This is standard practice for user research, but make sure the user feels empowered to stop whenever they want.
  • Have players take off the headset between sessions or questions to limit fatigue.
  • Use think-aloud protocol to understand players' initial reactions, but use survey questions that force players to think about difficulty and challenge outside of VR. This helps balance out the "new to VR high" without fully negating it.
  • It helps to inquire about player expectations before and after they try something -- don't ask leading questions. For example, ask "what do you expect to happen when you pull the trigger?" and then "is that what you expected to happen?" In this example, you would not directly ask "how would you grab the object?" because that leads them toward a schema of the game they might not have discovered yet.
  • Listen for mentions of frame rate drops, lag, dizziness, and headaches.
  • Make your user as comfortable as possible.
  • Time box testing to avoid fatigue.
  • Use the microphone built into the Vive to capture voice data. If you're encouraging think-aloud protocol, the player will give you tons of information you'll want to capture.
  • Standardize your data formats so that you can pre-format spreadsheets and scripts. As an example, our game output a spreadsheet for each player, and the death count for level 1 was always labelled "Level1_Deaths". When I wrote my R scripts for data analysis, I could simply plug in a new spreadsheet containing the 6-8 users from that sprint and run the same script again.
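The payoff of that last point looks something like the sketch below. We used R, but the same idea holds in any language; the column name "Level1_Deaths" follows the convention described above, while the function and sample data are hypothetical:

```python
import csv
import io

def summarize(csv_text, column):
    """Average a standardized telemetry column across all players in one export."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    values = [int(r[column]) for r in rows]
    return sum(values) / len(values)

# Each sprint's export is one spreadsheet, one row per player, with the
# same standardized headers every time -- so the same script always works.
sprint_export = """Player,Level1_Deaths,Level1_Time
P01,2,180
P02,0,95
P03,4,260
"""

print(summarize(sprint_export, "Level1_Deaths"))  # prints 2.0
```

Because the headers never change between sprints, analysis becomes a matter of swapping in the newest file rather than rewriting the script.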
