About the game: The Irregular is a virtual reality mystery-solving experience based around critical thinking and teamwork. Players, or a group of players, will work to gather clues, answer questions, and solve the mystery. The Irregular has been selected as a finalist for the Ed Sim Challenge, a competition funded by the US Department of Education seeking next generation educational simulations for high school students that strengthen career and technical skills.
The Irregular’s first mystery, “The Watermelon Problem,” is the showcase for our virtual reality mystery experience, so it needs to be perfect. As we’ve never built complex puzzles like this before, or a mystery game for that matter, we knew that running thorough playtesting with real players using our prototype would help us reveal any breakdowns in the game design and flow. We took the step of crafting a paper prototype of the game and playtesting through the complete mystery on paper before moving into production. We’d like to share our paper playtesting process and learnings with devs, especially those making mystery, puzzle or logic games, including those with high production values.
About the author: Lanie Dixon is the resident Games UX Researcher at Octothorpe in Salt Lake City, Utah.
Identifying Our Most Important Questions
At Octothorpe we’re a player-centric studio, embedding player feedback as early as possible and throughout development, but playtesting before writing any code was a new challenge. We sat down as a team and compiled our core concerns in order to determine the best means of early prototype testing. Because we were designing a mystery, we knew there would be intentional loose ends and grey areas; we needed to ensure that players were not getting stuck in those places indefinitely. Getting the difficulty and logic of the underlying mystery sorted first would allow us more focused time with the VR interaction design challenges later in development. To get a better understanding of the foundation of the mystery, we settled on the following questions for our prototype test to answer:
- Question 1: Are there any major breaks in the logic of the mystery?
When designing the mystery it is difficult to anticipate how players will understand, recall and utilize information in each ‘clue’. The team carefully designed each clue’s narrative relationship, ensuring they made logical sense by mapping out the ‘intended’ logic path and all necessary logical connections.
Using the paper prototype we were able to print out the clues and ensure that the correct clue connections were present where we intended.
In the research sessions we needed to understand how players actually worked through the clues and hints, one-by-one, so I resolved to take notes on playtesters’ use of every clue in real time during the playtesting. When players answered each question of the mystery I asked them to point out the clues they used to make their conclusions (“How do you know?”), so we could discover breaks in the mystery logic.
- Question 2: How is the overall difficulty of the game?
We needed to check the difficulty of the mystery, in all its forms. With many ways to solve the mystery we needed to ensure that players could complete it without undesirable frustration. Data on the difficulty comes from both the player’s opinion and their behavior. To capture it I recorded time spent per question, wrote down which clues they used either correctly or incorrectly, and asked players after their session, in a few different ways, how difficult they found the game.
- Question 3: Do players understand the larger picture of the story?
Making sure that players can understand the narrative at each step of the puzzle is important for eliminating unintentional confusion. Because the design of mystery games inherently falls into a grey area of confusion, we knew this would be a pivotal challenge to address. At points the mystery was intentionally vague, and at others knowing the narrative was instrumental to progressing to the next question. The team was curious to know how well players grasped the overall story arc. To capture this I would ask players to recount key narrative plot points and to briefly recount the whole mystery narrative after the session.
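The clue tracking described under Question 1 (asking players which clues they used, then comparing against the intended logic path) could be tallied after a session with something like the sketch below. This is only an illustration, not the tooling we used: the question IDs, the clue names, and the `compare_clues` helper are all hypothetical.

```python
# Hypothetical intended logic path: question -> clues a player is expected to use.
INTENDED = {
    "Q1": {"ledger", "letter"},
    "Q2": {"map", "telegram"},
}


def compare_clues(intended, cited):
    """Compare the clues players cited with the intended clues per question.

    `cited` maps a question ID to the clues a player pointed to when asked
    "How do you know?". Returns, per question, the intended clues that were
    missed and the cited clues that were not part of the intended path.
    """
    report = {}
    for question, expected in intended.items():
        used = set(cited.get(question, ()))
        report[question] = {
            "missed": sorted(expected - used),
            "unexpected": sorted(used - expected),
        }
    return report


if __name__ == "__main__":
    # Hypothetical session: the player only answered Q1 and cited two clues.
    session = {"Q1": ["ledger", "photo"]}
    for question, result in compare_clues(INTENDED, session).items():
        print(question, result)
```

A question where many players show the same "missed" or "unexpected" clue is a candidate for a break in the mystery logic and worth a closer look in the moderator notes.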
A Mystery on Paper
Reducing costly iteration while we explored the mystery design was the core objective of this paper prototype playtesting. But to answer the team’s questions about the mystery, it needed to be a complete version. The team created a completely printable paper version of the full mystery – complete with written and pictorial clues. Prototyping everything on paper rather than digitally meant the design team only needed to create a few 2D assets for the test, easily and quickly created in Photoshop. The production cost of these disposable 2D assets was next-to-nothing compared with the alternative of having high-res 3D assets on the cutting room floor, had we implemented first and playtested later in development. We hoped that fine-tuning the mystery on paper would allow us to shift focus onto VR interaction design and other significant challenges of VR earlier.
The original document which we created to describe each of the clues as well as lay out the questions for players.
Test Procedure
For the mystery I knew it would be important to “get into the player’s head” during the play session, since we anticipated that a lot of play time would be players mentally reviewing clues. The issue was how to get this information from people: what are their thought processes as they are reviewing the clues? In the past I’ve had mixed success in encouraging players to verbalize their thoughts during playtests. To overcome these challenges, we came up with a test plan that put me in the role of a ‘Dungeon Master’ of sorts, which we hoped would maximize the possibility of players actively communicating with me, but minimize my interference in their thought processes. We already had plans for an NPC character that would work with the player in the game so, though it would be highly unconventional, I decided that the test would involve me role-playing as this NPC by reading his script. As the NPC was already a part of the design we didn’t need much further development to craft a complete script for me to use to answer players’ questions and provide feedback on their answers to each question.
We hoped that by role-playing as the NPC I would make players more comfortable with me sitting next to them and more at ease engaging in unprompted conversation with me. This was highly risky, as there was potential for me to influence players and thereby alter their experience, tainting the playtest data, since we can only validate the players’ understanding if each player has the same game experience. Moderating playtests can be very challenging, and poorly moderated sessions can potentially be damaging to playtest data. Keeping these challenges in mind going into our planning helped us be mindful of the day-of processes for each playtest.
For each player, we laid the printed paper clues out in the same positions, then allowed them to move the clues around and organize them however they wanted.
Because we were doing things a bit differently, it was important for players to understand my larger role in their game. Through an introduction prior to starting, I explained that I would be interacting with them throughout their session and would provide question prompts as well as hints and clues for correct and incorrect question responses. I encouraged players to ask me questions at any point but noted that I may not be able to answer their questions or help them every time. In addition to the clues we also provided players with blank paper and a pen to take notes should they choose.
What We Learned
We learned a huge amount from the paper prototype testing. Not only did we gain valuable insights about the mystery design, we also gained some insight into running future mystery testing and paper prototype research.
Build in Loads of Test Prep Time
As I’ve mentioned before, we are a small studio with limited resources and budget. Because we knew we wanted to be able to do as many player tests as possible, we had to maximize each opportunity. Once the design team was comfortable with the prototype we planned 3 days to prepare the test procedures, three times longer than a typical playtest takes to prep. We allocated ourselves 2 days to review the clue sets and brainstorm about potential issues, and on the third day we discussed and finalized the developer questions. It was important that I took the time to consider the team’s questions so that I could come up with a test procedure to gather that information rigorously. Additionally, since I was adding complications by playing the NPC, I needed time to prepare the script. If you’re considering running a similar type of session, ensure you’re building in extra prep time, more so if you’re writing and rehearsing the script prior to the playtest, which you should plan on doing.
Role-playing as an NPC Paid Off
It was important that each player received a similar experience of the game and I wanted to ensure that I went off script as little as possible. To help, the design team created a master guide which indicated the script and questions, correct clue responses to each of the questions, as well as hints for incomplete or incorrect responses. I also had planned out other questions such as “What clues do you think we should use to solve this?” as well as prompts to use when players sat inactive. To handle invalid responses I wrote down canned responses such as “I can’t help you with that right now” or “Maybe you should keep looking” to give myself standard answers in case instances came up that the guide wasn’t prepared for.
When asked which clues they used to answer a given question, players typically would pick up the clue, point out which information they felt was relevant, and explain what they understood the answer to be.
We took a huge risk in giving me an active role in the player’s game session, but our careful scripting and planning did pay off. My playing the NPC did actually seem to help players actively engage with me. Players frequently referred to me by the NPC name and initiated conversation with me throughout their session, most often without prompting. Knowing when and how to interact with players during a playtest takes practice; writing down canned responses and having a script is a must.
Taking Detailed, Timestamped Moderator Notes
An important part of moderating a playtest is taking notes during the session. I knew that at times things would progress quickly during the game, and because I would need to be taking notes as well as reading off my script I created a spreadsheet beforehand with conditional formatting to quickly color code my note entries (purple for my observation notes, yellow for direct reports or comments from the player). I opted to auto-fill timestamps on each note row so I could calculate afterwards how much time players were spending on each phase of the mystery, rather than juggling clocks and timers throughout the session. During the session I noted comments and direct reports from the players (e.g. “I don’t know what to do here”) as well as observations I made (e.g. multiple players were observed, unexpectedly, making timelines using dates from the clues), and I noted when questions were given as well as each response (correct and incorrect) players gave. It turns out that the timestamps were incredibly valuable, more so than in a typical playtest. From the timestamped notes we were easily able to calculate how long it was taking players to correctly answer each question. Additionally, the notes on the players’ correct and incorrect responses helped create a more detailed picture for the team about how players were working through the mystery.
To keep up with the quick pacing of the session I had to type as quickly as possible while still being able to decipher the notes later. Color coding each entry made it easier for me to go back and find the notes I needed.
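The per-question timing calculation described above can be sketched in a few lines. This is a minimal illustration under assumptions, not our actual spreadsheet: the row layout (timestamp, note kind, text), the note kinds such as `question_given` and `answer_correct`, and the sample rows are all hypothetical.

```python
from datetime import datetime

# Hypothetical note rows exported from the moderator spreadsheet:
# (auto-filled timestamp, kind of note, note text or question ID).
NOTES = [
    ("14:02:10", "question_given", "Q1"),
    ("14:03:05", "player_comment", "I don't know what to do here"),
    ("14:05:40", "answer_incorrect", "Q1"),
    ("14:07:25", "answer_correct", "Q1"),
    ("14:07:50", "question_given", "Q2"),
    ("14:10:20", "answer_correct", "Q2"),
]


def seconds_to_correct(notes):
    """Return {question: seconds from when it was given to the first correct answer}."""
    started = {}
    durations = {}
    for stamp, kind, text in notes:
        moment = datetime.strptime(stamp, "%H:%M:%S")
        if kind == "question_given":
            # Keep the first time the question was asked.
            started.setdefault(text, moment)
        elif kind == "answer_correct" and text in started and text not in durations:
            durations[text] = int((moment - started[text]).total_seconds())
    return durations


if __name__ == "__main__":
    for question, seconds in seconds_to_correct(NOTES).items():
        print(f"{question}: {seconds // 60} min {seconds % 60} s")
```

Because every row carries its own timestamp, nothing has to be timed by hand during the session; the durations fall out of the notes afterwards.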
In the end we discovered that regardless of which play strategy players took to complete the mystery, they had very similar total play times and similar difficulty ratings. From the notes we saw that players who spent more time reviewing clues in the initial clue review phase spent much less time answering questions than those who spent very little time reviewing and went straight into questions. This was helpful for us since we didn’t want to prescribe how players experience the game, but we were still interested in estimates for total play time.
Giving Players Paper and Pen to Write Notes
Even though I planned to have an active role in the play session to encourage players to communicate with me, I also provided them with paper and a pen to take notes. Our hope in doing this was to get additional insights into players’ strategies as well as to give players an aid for answering questions. What we found from reviewing the players’ notes is that they were making mostly accurate connections and were gathering the clues as we had hoped. During the post-game interview players pointed to their own notes on multiple occasions and stated they ‘felt smart’ that they had noted important details early on. By allowing players to write down notes we hoped to lessen any memory overload and spare them the frustration of trying to remember multiple details. The goal in providing the paper was that players with a natural tendency to take notes would be able to do so.
Going back and reviewing notes written by the players was helpful in beginning to understand whether players were making the correct connections from the clues.
We knew that we wanted players to have some ability to “take notes” in the digital game. Seeing that players were taking such detailed notes during the playtest justified the need for the mechanic. Reviewing what players’ behaviors looked like when they were writing notes let us shape the design of the future in-game note-taking mechanic prior to committing anything to code.
Bottom Line: Playtesting with a paper prototype helped cut time and costs
Initially we were unsure about how paper prototyping would work with a complex mystery design, but this testing has undoubtedly cut down time spent in the future on costly iterations. By using the original design document our design team created we were able to run a paper prototype test with minimal modifications to the original document, which meant we got to testing with real players more quickly. Having the entire design on paper meant that issues that did come up were more easily remedied on paper than by digging into the code.
Following the test we were able to identify important adjustments that needed to be made to key clues. In all, 5 clues needed updates: some only needed minor font and/or terminology changes, but others needed info added or removed which changed their overall appearance enough that they had to be remade. Additionally, one entirely new clue was created and added to help with the understanding issues that came up later in the game during the playtests. All of these changes were made prior to having the 3D assets made, which saved us several thousand dollars.
Following the test we discovered that the ‘I’ in ‘I. Collins’ was continually being mistaken for a ‘T’ in our chosen font, so we changed the font to make the letters more distinguishable. We also discovered that players were frequently confusing credits and debits on the bank ledger.
During testing, we discovered that a map was needed. Many of the clues referred to cities, but players less familiar with London and Wales did not understand the importance of their different geographical locations.
Building a game in VR has its own set of challenges, so ensuring our mystery was solid before implementing it in VR means we can spend more time and resources ensuring the VR interactions are where we want them to be. We now have fundamental research and test procedures we can use and follow for new paper prototype mysteries in the future. In the end we learned that even a complex mystery game can be playtested entirely on paper with minimal overhead cost and with very little production time. This all resulted in quicker validation of our design decisions from real players, which meant we didn’t have to only hope that the mystery would be understandable once it was in VR. The test results allowed us to make the necessary adjustments to the design and informed our design and VR implementation process at a critical early stage with low overhead.