We have an incredible opportunity in front of us: the folks who will be using our ASK app are coming through the door every day, so we can test directly with our audience. As a formalized process, this is pretty new to us, but it's something we've really taken to heart on Bloomberg Connects. This post addresses how we've been running testing sessions, the logistics involved, and lessons learned the hard way. As we've moved through testing stages, we've implemented two types of user testing—formalized and ad-hoc sessions. Formalized sessions (see sample logistics sheet) are extremely complex affairs that help us conduct end-to-end testing of the entire app. These sessions, scheduled on Thursday nights when we are open late, last between one and three hours. Testers are invited ahead of time, and the entire team (web, interpretation, curatorial) is on site to run things. Ad-hoc testing is used to test small UX changes; it takes place during the day, when two members of the web team solicit help from visitors at random. Staff spend about an hour during ad-hoc sessions seeking out as many visitors as they can to look at, usually, just one thing.
Recruiting users is always a challenging task. In both types of testing, we aim for people who represent the full diversity and demographics of our audience, so we've learned to broaden our net when recruiting. For formalized testing, we have sought testers from our museum members, e-news subscribers, social media followers, volunteer lists, and friends and family. We've also learned that we need to run testing sessions at different times of day because different constituencies want to come to the museum at different times. We've adopted the airlines' overbooking strategy: if we need 10 people, we schedule 15. Finding users for ad-hoc testing is quite different. We are approaching people at random and need to catch potential testers at the right moment because we don't want to interrupt their museum visit at the wrong time. We've discovered that lines (the cafe, for example) and interstitial spaces are good places to work. Internally, we've dubbed ad-hoc testing "somewhat socially awkward user testing" because—and if you've ever done this you know—it's pretty hard to go up to people directly and ask for help at just the right time.
In both types of testing, we now have a "set the script" rule. We've found it's incredibly important for consistency that every team member introduces the app and/or the "ask" for testing in exactly the same way. We've learned the hard way that subtle shifts in the script can change how a tester interacts with the app. Setting the script and following it closely helps us get consistent results. It's also important to decide exactly how much information you want users to have before they see the app. This is a tough one, though: for formalized testing, people need more motivation to come spend time with us, and they want some idea of what they will be doing. In general, we aim for fresh eyes (people who have not seen the app before) combined with as little information as possible about its use. In formalized testing, we've had to give a little more info. In ad-hoc testing, we can often go in cold and get a good feel for someone looking at things from a zero start point.
Throughout testing, we've found that observation combined with exit interviews is key because we want to ensure that what people tell us about the app matches what we see them doing with it. As such, in formalized sessions testers are trailed (either individually or with one staffer covering a gallery area for all testers walking through it) and then interviewed afterward. Whenever possible, we've tried to keep exit interviews separate from tester to tester to avoid information cascades and influence among participants. On that note, staff running formalized testing sessions have been asked to hold their notes and write things down individually, so we can compare notes with minimal influence among ourselves. Additionally, during ad-hoc testing, we've found it useful for two staffers to work together; having two sets of eyes observing these smaller interactions ensures we aren't basing changes on a single staffer's interpretation.
Lastly (and this may seem obvious, but it's here because it wasn't so clear to us during the first go-round): stagger your start times. During formal testing, when you have 10-60 testers, it's incredibly important for all involved to stagger start times so you don't get overwhelmed. In one session we staggered 20 people per hour for three hours, but found that even more staggering within the hour would have helped us considerably. In reality, we would have done better with 5 testers every 15 minutes for three hours.
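If you want to sanity-check a schedule like this before sending invites, the arithmetic is simple enough to script. This is just an illustrative sketch (the function name and slot sizes are ours for this example, not a tool we actually use), showing how 5 testers every 15 minutes over three hours works out:

```python
from datetime import datetime, timedelta

def stagger_slots(start, hours=3, per_slot=5, slot_minutes=15):
    """Yield (slot_time, tester_count) pairs for a testing session.

    Defaults match the example in this post: 5 testers every
    15 minutes for three hours.
    """
    slot_count = int(hours * 60 / slot_minutes)
    for i in range(slot_count):
        yield start + timedelta(minutes=i * slot_minutes), per_slot

# A hypothetical Thursday evening session starting at 6pm
schedule = list(stagger_slots(datetime(2015, 6, 4, 18, 0)))
# 12 slots of 5 testers each = 60 testers, same total as
# 20 per hour for three hours, but with a much smoother flow
```

The totals match the batched version; the difference is entirely in how bunched up the arrivals are.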
Do we seem organized? I assure you this post is the result of countless follies, "duh" moments, and "we have to blog this" moments. It feels like every time we run a user testing session we learn something new. My last lesson, for now, is that no matter how much you plan, something will always go wrong, and sometimes you just have to roll with it.
We've got a few upcoming posts about user testing. We'll be talking about specific things we've learned and the changes we've made accordingly. We're also going to talk about managing user expectations and other technical gotchas. Stay tuned.