
Traveling in Cyberspace:
Psychology of Software Design, Part II
Usability Evaluation

J. Philip Craiger
University of Nebraska at Omaha

In my last column, I presented a description of the software¹ design process and showed how psychology plays an important role in building successful interactive software. In this column I will discuss the second part of the design process, usability evaluation. Usability evaluation is a form of testing that is applied to the design of computer software, in particular, the interface with which users interact. Essentially, it allows the design team to determine the extent to which an interface will support users in doing whatever they need to do (whether work or play).

¹ I use the terms "software" and "interface" interchangeably throughout this column. As I discussed in my last column, the interface is the part of the software with which the user interacts, and therefore, to the user, the interface IS the software.

The terms "usability" and "user-friendly" (a term I assume most of you have heard) are loosely interchangeable. Usability has been defined more precisely as:

the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use (Karat, 1997, p. 691).

There are numerous usability evaluation methods, and they differ in cost of implementation, in who works on the evaluation team (actual users, usability experts, or a combination thereof), in their effectiveness at finding usability problems, and so on. Simpler methods are called discount usability methods because they are low-cost and relatively easy to use. At the opposite end of the complexity and effort spectrum is user testing. Full-blown user testing involves real end users of the software performing real tasks with a fairly complete and functional high-fidelity software system in a laboratory setting.

Why is usability important? Software that is difficult to use has many ramifications, ranging from personal dissatisfaction and lower productivity to increased training costs. "The cost of less-than-user-friendly software can be astonishingly high - the combined result of unnecessarily high training and customer support costs, unnecessarily low productivity, and lost market share" (Mayhew, 1999, p. x).

Following the user-centered design principles I described in my last column does not guarantee a product's usability. Rather, one might say that user-centered design is a necessary but insufficient condition for usability. Interface design is not a one-shot deal. It requires numerous iterations to get it right (i.e., to ensure a software product's usability). It may be helpful to think of interface design as similar to the process of writing. An author can start out with an understanding of the intended audience, a clear theme, an outline, thorough background research, and so forth, but that doesn't guarantee that the first draft will be perfect. It usually requires numerous iterations of writing, reading, revising, reading, revising, and so forth. This is analogous to software design: develop the concept, generate the design, evaluate it, redesign, evaluate, redesign, and so on, until it is right.

Aspects of Usability

Usability is actually a multidimensional construct whose meaning is derived from the aspects of the software that affect its purpose and use. That is, the criteria used to evaluate software usability will vary depending upon who will use the system and the characteristics of the tasks for which the system is used. The most common usability criteria (Nielsen, 1993; 1994) include:

Productivity and efficiency: Software should be designed so that users can perform tasks quickly and efficiently.

Minimize errors: Software should be designed to prevent errors.

User control: Software should allow users the freedom and control to complete a task as they see fit. It should not force users to follow one path through a task.

Ease of learning: Software should be easy to learn. If software is used infrequently, it should be easy to remember how to use it.

User satisfaction: Users should enjoy using the software to do their job.

In the best of all worlds, all software would reflect each of these criteria: be easy to learn; promote fast and efficient work; minimize errors; allow users flexibility in sequencing tasks; and achieve a high level of satisfaction with all users. This, of course, is not a realistic situation. Designing to achieve one criterion may have an adverse effect on another usability criterion. To illustrate, say you have two usability goals: to maximize ease of learning and to provide user control and flexibility. Making the software easy to learn may require limiting the number of choices, such as menus, buttons, and so forth, available to a user (otherwise, it would be too confusing for novice users). Or, to make it easy to learn, the design could force users to complete steps in a strictly linear fashion (e.g., the typical "wizards" you encounter if you have ever installed any software). These design decisions have the effect of limiting the flexibility that a user has in completing the task. Certainly, expert users would like more choices, as well as the freedom to complete the same task in different ways (to alleviate boredom, to accommodate their personal work style or mood, and so on). Notice that the reverse may also hold: providing the user the freedom to accomplish a task in multiple and varied ways may make the software more difficult to learn.

Note also that there are both positive and negative relationships among the usability criteria. For instance, we expect positive relationships between user satisfaction and the remaining criteria (easy-to-learn, fast, and flexible software is more satisfying to users). Other criteria, clearly, are negatively associated. The more errors that are committed by a human, the less efficient and productive he or she is.

The usability criteria are not selected arbitrarily by the designers. Rather, the selection of these criteria is determined by the types of users who will be using the software and the tasks for which they are using it. For example, if most users are novices (due to high turnover or seasonal work), and reduced training costs are important, then ease of learning may be the most critical criterion. For telephone assistance operators, speed of execution and minimal errors would be important. For a child's computer game, ease of learning may be more important than speed of execution and efficiency.

Now let us turn to a description of three usability evaluation methods. I will start with heuristic evaluation, a fairly simple discount usability method.

Heuristic Evaluation

Heuristic evaluation is a method of evaluating an interface based on several simple heuristics. It involves usability experts providing independent evaluations of a prototype to identify potential violations of the heuristics. Heuristic evaluation is considered a discount method because it is relatively inexpensive: for example, it typically does not involve actual end users of the software, nor does it necessarily consider the full range of actual tasks that users will perform.

Nielsen (1994) identified 10 heuristics that account for the majority of usability problems. For example, heuristics of good design include:

Visibility of system status: The software should inform users about what is happening through appropriate feedback.

Match between system and the real world: Language and concepts should be familiar to the user. Information should appear in a natural and logical order.

Consistency and standards: Users should not have to wonder whether different words, situations, or actions mean the same thing. Platform conventions should be followed.

Error prevention: The design should prevent errors from occurring.

Recognition rather than recall: Objects and actions should be visible and apparent to the user. Users should not have to recall information when it could be provided by the software.

The reader is referred to Nielsen (1994) for a description of the remaining heuristics. A heuristic evaluation is conducted by a group of evaluators, working independently, applying the set of heuristics to a prototype. These independent evaluations are aggregated, and the violations which occur most frequently across the evaluations indicate a problem with the design.
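The aggregation step can be illustrated with a minimal sketch. The evaluator names, the screen elements, and the findings below are hypothetical, invented only for illustration; the point is simply that each evaluator contributes at most one "vote" per violation, and that violations are ranked by how many evaluators reported them.

```python
from collections import Counter

# Hypothetical findings: each evaluator independently lists
# (screen element, heuristic violated) pairs for the same prototype.
evaluations = {
    "evaluator_a": [
        ("field order", "Consistency and standards"),
        ("no login button", "Recognition rather than recall"),
    ],
    "evaluator_b": [
        ("no login button", "Recognition rather than recall"),
        ("username label", "Error prevention"),
    ],
    "evaluator_c": [
        ("field order", "Consistency and standards"),
        ("no login button", "Recognition rather than recall"),
    ],
}


def aggregate_violations(evaluations):
    """Count how many evaluators reported each violation, most frequent first."""
    counts = Counter()
    for findings in evaluations.values():
        # set() ensures an evaluator is counted once per violation
        counts.update(set(findings))
    # Sort by descending count, then alphabetically for a stable order
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranked = aggregate_violations(evaluations)
for (element, heuristic), n_evaluators in ranked:
    print(f"{n_evaluators} evaluator(s): {element} [{heuristic}]")
```

In practice a team would probably also record a severity rating for each violation; that is left out here to keep the sketch short.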

Figures 1 and 2 below illustrate examples of bad and good design, respectively. The graphic is a simple login screen (I assume that most of you have had to log in to a computer system at least once). I have applied the heuristics to the first screen (Figure 1), and below I've listed the violations of the heuristics. Figure 2 illustrates an alternative design which has corrected the violations.

Figure 1. A login screen with several design violations

The violations are as follows:

The sequence of input fields (based on a vertical task flow) does not match a traditional login sequence. Users are typically asked for their username first, and then their password. Here we have the reverse, which will undoubtedly cause problems. For example, more experienced users may not pay attention to the labels, relying instead on their expectations of the sequence, and mistakenly type their username in the password field. Heuristics violated: Match between system and the real world, consistency and standards, and error prevention.

Improper labeling of the "username" field. Novice users may be confused and may type in their real name ("Filo J. Farnsworth") instead of their username ("ffarnsworth"). Heuristics violated: Consistency and standards, and error prevention.

There is no visible way to complete the login sequence. That is, after completing the two fields, what does the user actually do to log in? Typically, a button is provided which the user presses to complete the login procedure. Here, although nothing on the screen indicates it, the user has to press a special key (e.g., F7) to complete the procedure. Novice users would have difficulty knowing what to do. Moreover, because this is a nonstandard way to complete the sequence, intermittent users (e.g., someone using the software once a month) would probably forget what the special key is. Heuristics violated: Recognition rather than recall, and consistency and standards.

Figure 2 demonstrates an alternative design that corrects the aforementioned violations.

Figure 2. A login screen with violations corrected.

Cognitive Walkthrough

The cognitive walkthrough procedure was developed to evaluate the learnability of a software system, and allows designers to answer the question: "How easy would it be for a particular set of users to learn this software?" Similar to the heuristic evaluation, and unlike user testing, actual end users are not required to conduct a walkthrough. Rather, designers, usability specialists, software engineers, developers, and the like, are used as surrogates for users.

Two essential components of the walkthrough are a prototype of the system and an explicit understanding of the intended users, including their prior knowledge and training, their experience with similar software, and any other assumptions that would affect their ability to learn the software (Lewis & Rieman, 1993).

The walkthrough proceeds by examining each step the user would take to accomplish a task, and trying to tell a believable story as to why the users would choose a particular action (Wharton, Rieman, Lewis, & Polson, 1994). For example, if the first step in a particular task requires the user to press a button labeled "update," then a believable story would explain why a user would be likely to accomplish that step. Note that believable stories are based on assumptions about the user's background knowledge and what they are trying to accomplish, and on an understanding of the elements of the software that would enable a user to determine the appropriate action. This is why it is critical that these assumptions be expressed explicitly at the outset.

As described in Wharton et al. (1994), for each step required to accomplish a task, the evaluators ask the following four questions:

Will the users understand that there may be a subgoal to complete before they begin? For instance, if the user's task is to print a document, will the user know that they must select a printer first?

Will the user notice that the correct action is available? That is, is there something that is visible to the user, such as a button, a menu, a text field, that gives the user a clue as to what to do next?

Will the user associate the correct action with the effect to be achieved? This question speaks to the association between the goal of the user and the visible parts of the interface. For example, in Figure 1 above, how would the intended users know that they had to press the F7 key to complete the login process? They wouldn't, so the answer to this question would be "no." This would indicate a usability problem, and the software would need to be revised such that the evaluation team could answer "yes" to this question (e.g., by modifying the screen to resemble Figure 2).

If the correct action is performed, will the user see that progress is being made toward their goal? This question relates to feedback provided by the software. For example, if the user presses a button to print a document, would there be some type of feedback that would let the user know that the document is being printed?

At each step of the sequence required to perform a task, the evaluation team will try to construct a believable story that defines success (i.e., answering "yes" to each of the questions above). If any question receives an answer of "no," then the evaluators have found a problem with the software, and an alternative design should be considered.
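The bookkeeping for a walkthrough can be sketched as follows. This is only an illustration under stated assumptions: the step names and the yes/no answers are hypothetical judgments about the Figure 1 login screen, and the `StepReview` and `find_problems` names are invented for this sketch. The four questions are abbreviated from the list above.

```python
from dataclasses import dataclass

# The four walkthrough questions, abbreviated from the text above.
QUESTIONS = (
    "Will the users understand that there may be a subgoal to complete?",
    "Will the user notice that the correct action is available?",
    "Will the user associate the correct action with the effect to be achieved?",
    "Will the user see that progress is being made toward their goal?",
)


@dataclass
class StepReview:
    """The evaluators' answers for one step of a task (hypothetical structure)."""
    step: str
    answers: tuple  # four booleans, in the same order as QUESTIONS


def find_problems(reviews):
    """Return (step, question) for every question answered 'no'."""
    problems = []
    for review in reviews:
        for question, answer in zip(QUESTIONS, review.answers):
            if not answer:
                problems.append((review.step, question))
    return problems


# Hypothetical answers for logging in with the Figure 1 screen.
reviews = [
    StepReview("Type username", (True, True, True, True)),
    StepReview("Type password", (True, True, True, True)),
    StepReview("Press F7 to submit", (True, False, False, True)),
]

problems = find_problems(reviews)
for step, question in problems:
    print(f"Problem at step '{step}': {question}")
```

Any entry in `problems` marks a step for which no believable success story could be told, and therefore a candidate for redesign.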

User Testing

The most comprehensive evaluation technique is user testing. User testing involves a set of actual end users conducting a set of real tasks using a high-fidelity and functional prototype in a laboratory setting. It is the most costly and effortful usability evaluation method; however, it is also the method most likely to identify the most comprehensive list of usability problems. Because of its expense, user testing is often limited to larger companies with the money and experienced personnel to conduct such a test. Even then, user testing can be delayed to the end of the design process because of the need to have a fairly complete and functional prototype.

User testing involves actual users working through tasks on the prototype while members of the evaluation team record information relevant to usability, such as the number and type of errors made, the time taken to complete a task, and so on. These sessions may be video- or audio-taped for analysis at a later time. A technique called thinking aloud is often used to gather additional feedback. Here users are asked to "think out loud" regarding what they are doing or thinking as they are working through a task. Data from thinking aloud provides valuable information that designers would otherwise be unable to gather, such as the users' goals and intentions as they are working through a task, and the reasons why they chose a certain option or course of action.


Finally, the actions that users perform can be captured by the computer and saved to a file for examination by the evaluation team after the user test. Logging users' actions provides very detailed information on not only what the users did during the task (what buttons they pushed, menus they accessed, etc.), but also how long it took them to complete various aspects of the task.
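A minimal sketch of such action logging is shown below. The `ActionLogger` class, its method names, and the JSON file format are assumptions made for illustration, not a description of any particular logging tool. The clock is passed in as a parameter so that the example can use fixed, made-up timestamps; a real session would use the default clock.

```python
import json
import time


class ActionLogger:
    """Records timestamped user actions (a hypothetical, simplified logger)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # function returning the current time in seconds
        self.events = []

    def record(self, action, target):
        """Store what the user did, what it was done to, and when."""
        self.events.append({"t": self._clock(), "action": action, "target": target})

    def save(self, path):
        """Write the log to a file for examination after the test."""
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.events, f, indent=2)


def elapsed_between(events):
    """Seconds between consecutive events, paired with the later event's target."""
    return [
        (later["target"], later["t"] - earlier["t"])
        for earlier, later in zip(events, events[1:])
    ]


# Made-up timestamps stand in for a real clock in this example.
fake_times = iter([0.0, 2.5, 9.0])
logger = ActionLogger(clock=lambda: next(fake_times))
logger.record("click", "username field")
logger.record("click", "password field")
logger.record("click", "login button")

gaps = elapsed_between(logger.events)
for target, seconds in gaps:
    print(f"{seconds:.1f} s before reaching '{target}'")
```

A long gap before a particular action, such as the 6.5 seconds before the login button in this made-up log, is the kind of detail the evaluation team might follow up on.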

Conclusion

The three evaluation methods described above are not necessarily mutually exclusive. For example, a design team may apply the cognitive walkthrough procedure very early in the design process, and once a more functional prototype is available, move to user testing. Or a design team may decide to use all three methods at one point or another. Either way, usability evaluation is essential in determining the usefulness of interactive software products, and some form of evaluation is better than none.

Computer systems are playing an increasingly important role in our lives. More and more jobs require some computer usage. Many of our home appliances, once purely mechanical, are now driven by a tiny computer chip. Although some computer systems are "embedded" (i.e., not for direct human use), most computer systems are made for direct human interaction. Consequently, it is important that humans are able to use the system to do what they need to do, and usability evaluation plays a key role in determining a system's usefulness. For those of you who are interested in reading more about usability evaluation, I've provided some additional references below.

Further Reading

Hix, D., & Hartson, H. R. (1993). Developing user interfaces: Ensuring usability through product and process. New York: John Wiley & Sons.

Karat, J. (1997). User-centered software evaluation methods. In M. Helander, T. K. Landauer, & P. V. Prabhu (Eds.), Handbook of human-computer interaction. Amsterdam: North-Holland.

Lewis, C., & Rieman, J. (1993). Task-centered user interface design: A practical introduction. A shareware book published by the authors. Original files for the book are available by FTP from ftp.cs.colorado.edu.

Mayhew, D. (1999). The usability engineering lifecycle: A practitioner's handbook for user interface design. San Francisco: Morgan Kaufmann.

Nielsen, J. (1993). Usability engineering. Boston, MA: Academic Press.

Nielsen, J. (1994). Heuristic evaluation. In J. Nielsen & R. Mack (Eds.), Usability inspection methods. New York: John Wiley & Sons.

Nielsen, J., & Mack, R. (1994). Usability inspection methods. New York: John Wiley & Sons.

Preece, J., Benyon, D., Davies, G., & Keller, L. (1993). A guide to usability: Human factors in computing. New York: Addison Wesley.

Rubin, J. (1994). Handbook of usability testing: How to plan, design, and conduct effective tests. New York: Wiley Technical Communication Library.

Spool, J. M., Scanlon, T., Schroeder, W., Snyder, C., & DeAngelo, T. (1998). Web site usability: A designer's guide. San Francisco: Morgan Kaufmann.

Wharton, C., Rieman, J., Lewis, C., & Polson, P. (1994). The cognitive walkthrough: A practitioner's guide. In J. Nielsen & R. L. Mack (Eds.), Usability inspection methods. New York: John Wiley & Sons.

 


TIP, January 2000