The Evolution of Usability Testing: An Interview with Dana Chisnell

We recently talked with expert usability practitioner Dana Chisnell. Dana is the co-author of the wonderful book on usability testing, The Handbook of Usability Testing, 2nd Edition. She has been conducting usability tests since 1983 and is one of the world’s leading UX experts.

Christine Perfetti: You’ve been conducting usability tests since 1983. In your experience, have you seen that usability tests have evolved over the last decade?

Dana Chisnell: I’d say the technique has evolved quite a bit over the past decade, mainly because technology is ubiquitous now. When I started usability testing, the things we tested were only used by a small number of people.

For example, the first test that I conducted was for the Professional Office System for IBM, a mainframe system that had this new thing called email. It was an interoffice online memo system that allowed users to send messages and update a calendar system. The users of the system were administrative assistants who scheduled and managed communications for their executive managers. It turned out to be an important system used in many executive offices including the White House.

We tested the user interface with administrative assistants knowing that there would only be 10,000-20,000 people who would ever use the system. Plus, IBM would provide training on the system to all of the users. That isn’t a lot of users when you think of today’s e-commerce websites. For example, Amazon.com has millions of people visit every day. Plus, Amazon’s interactions are much more complex than many of the systems I evaluated in the 1980s.

In the 1990s, usability testing got a couple of methodological improvements. One was remote usability testing, where we could observe users who were not in the same location as us and see what they were doing. This was huge because we could now gather data in places where we couldn’t go.

The other thing that happened was that we developed new prototyping methods to evaluate designs, such as paper prototyping. Paper prototypes allowed us to do very early testing of flows, basic navigation, and information architecture in a way that we couldn’t before.

Are more teams using low-fidelity prototypes and informal usability tests?

Yes. 80% of my projects with design teams are exploratory studies. The team has a concept, they sketch it out, they create something that is testable, and they go out and test it as quickly as possible.

I still get the impression that many teams aren’t doing much low-fidelity prototyping and evaluation. But I’m seeing it become more popular and practical as more development teams move to Agile approaches.

Are you seeing a big increase in the amount of usability testing happening within organizations?

More and more companies are doing usability testing. But I see over and over again that there’s a gradual evolution in their thinking.

Many teams first start testing at the end of the design process. This is what typically happens: A company discovers that they made some mistake when launching a design, someone on the team discovers usability testing, and they go try usability tests on their next project. This helps them realize they know very little about how their users behave.

When they realize this, the team backs up to the beginning and they try conducting some field research or ethnographic research to learn more about their users and their context. Next, they start filling in their process by conducting some exploratory tests at the beginning of the design, such as proof of concept or rapid iterative testing. Finally, they begin to conduct testing all the way through the process and summative tests right before they launch.

This evolution can take between one and ten years. But it does happen. I see that lots of different kinds of companies are figuring out that they need to do usability testing. I’m astonished that they’re just figuring it out.

You mention that many teams start testing once they’ve experienced a design failure. But despite some failures, many teams still struggle to convince stakeholders that testing is a valuable approach. How do you go about helping teams get buy-in for their tests?

Usability testing has evolved in another way over the last ten years. It’s gone from being a method for only identifying design problems and eliminating frustration to a way to gather information about users and to get data to inform design decisions.

That is my pitch to organizations. Look hard at how you’re making design decisions. Without doing some kind of user research or usability testing, how do you know you’re basing your design decisions on good information?

Yes, if you have very experienced designers on your team who really know your users and know what it’s like in context for users, you can probably do good design. Some companies successfully design this way without conducting usability tests.

However, if you’re like most of us and don’t know everything about your users, gathering data through usability testing can give you real insights about what it’s like for them. That really impacts the design.

One concern I hear from teams is that usability testing will take huge amounts of time and resources. You’ve been working with teams to help them conduct usability tests on a shoestring. How does that quick-and-dirty testing work?

My position on what I call “testing in the wild” is for teams to conduct ad-hoc testing or data gathering, formally or informally, whenever they can.

If the team only has one hour to observe one user, that’s better than making it up. I recommend doing as much observation of real people as fits into your budget and resource constraints. Any research will give you more confidence in the design decisions you’re going to make.

In the mid-1990s, Jakob Nielsen came up with the idea of discount usability testing. He has since stepped away from the modern interpretation of this technique. However, the approach involves conducting usability testing with a few users iteratively. Over time, you’ll amass data about your users, but you can still make design decisions in the interim without waiting. In my experience, if design teams allot 1-2 hours every couple of weeks to observe users, they are going to be way ahead of the game.

There’s still a lot of discussion in the field about how many users are necessary for testing. What’s your take on this never-ending debate?

My take is that it’s a very complicated issue. When Bob Virzi and others conducted research in the 1980s on the number of users, they found that five to eight users were typically enough for testing. But the field has changed since the research was conducted.

Much has changed since those initial studies. Many more people are using the same application for different task goals. Also, the audiences today are much larger than they were in the 1980s. And interfaces are much more complex because many of the UIs we use are embedded in other things.

For example, how do you conduct a usability test with only 5-8 people on an iPhone with 25 applications on it? We’re talking about a highly customizable mobile device where the individual experience is very different from one person to the next. This wasn’t true in the early studies on the number of users.

I tell teams to conduct enough sessions to give them confidence in what they are learning so they can go forward with a design direction. Keep asking questions and do research to answer your questions. It could be that 5-8 participants will be enough initially to answer your existing questions. But your work isn’t done there. Every time you’re making new design decisions, you’ll come up with new questions. That means you’ll want to test more users.
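The arithmetic behind figures like "five to eight users" can be illustrated with a problem-discovery model that is often cited in this debate. It is not described in the interview itself, and it rests on a strong simplifying assumption: every participant has the same independent chance p of running into any given problem. Under that assumption, the expected share of problems seen after n sessions is 1 - (1 - p)^n. The Python sketch below is only an illustration; the function names and the values of p are hypothetical, not data from any study.

```python
def share_of_problems_found(p: float, n: int) -> float:
    """Expected share of problems seen at least once after n sessions.

    Assumes (hypothetically) that each participant independently hits
    any given problem with the same probability p.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if n < 0:
        raise ValueError("n must not be negative")
    return 1.0 - (1.0 - p) ** n


def sessions_needed(p: float, target: float) -> int:
    """Smallest number of sessions whose expected share reaches target."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    n = 1
    while share_of_problems_found(p, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Illustrative values only: a high p stands in for a narrow, uniform
    # audience, a low p for a varied audience with many different goals.
    for p in (0.30, 0.10):
        n = sessions_needed(p, target=0.80)
        print(f"p={p:.2f}: {n} sessions for an expected 80% of problems")
```

With these illustrative inputs, a per-participant chance of 0.30 reaches the 80% mark in 5 sessions, while a chance of 0.10 needs 16. That sensitivity to p is one way to read the point above: as audiences and devices become more varied, a fixed small number of participants covers less, so it makes sense to keep testing as new questions come up.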

What some people may not know about you is that you’ve focused much of your research agenda on election ballot design. Can you tell us about your work in this area?

This is a really interesting area of design. Up until the last few years, sociologists and political scientists conducted all of the research on election and ballot design. They had never observed a human being using an election ballot. They just surveyed them or sent them questionnaires.

This all came to the forefront because of the 2000 election for U.S. President and the butterfly ballot. This was a design problem. The person who designed the ballot for Palm Beach County in Florida wanted to create a more usable ballot for her constituents.

Because many of the constituents in Palm Beach County were older, she wanted to increase the type size. With the larger type, the candidates for the office of President flowed over two pages and created the butterfly effect. This caused alignment problems, so people voted for people they didn’t intend to vote for.

This turns out to have been a big problem for a really long time. It’s hard for voters to vote the way they intend. So, I’ve been working with the Usability Professionals’ Association, AIGA, the Election Assistance Commission, the National Institute of Standards and Technology, and the NYU Brennan Center for Justice to communicate best practices for ballot design to local officials and the world.

We’ve made progress. I’ve collaborated with AIGA to design specifications for the Election Assistance Commission, and I’ve worked with the Usability Professionals’ Association to create a simple kit that anyone can use to conduct a usability test of a ballot. It’s now much more likely that voters will vote the way they intend.

Thanks so much for your time, Dana.