CHI '95 Proceedings
Posters

Validating an Extension to Participatory Heuristic Evaluation: Quality of Work and Quality of Work Life

Michael J. Muller, Anne McClard, Brigham Bell, Scott Dooley, Lori Meiskey, Judith A. Meskill, Randall Sparks, and Donna Tellam

U S WEST Technologies
4001 Discovery Drive
Boulder CO 80303 US
+1-303-541-6564 (voice) +1-303-541-8182 (fax)
michael@advtech.uswest.com or muller.chi@xerox.com

© ACM

Abstract:

We describe an extension and validation of Nielsen's heuristic evaluation approach, to include "humanistic" aspects of systems. Three additional heuristics addressed quality of work product, quality of work life, and respect for users' skills. In a participatory heuristic evaluation of an intelligent tutoring system, the three new heuristics performed comparably to earlier sets of heuristics.

Keywords:

Heuristic evaluation, usability, participatory design, participatory assessment, quality of work life, skill, quality

Introduction

Heuristic evaluation was developed by Nielsen and Molich [6] as a "discount" usability tool, through which practitioners can obtain usable and useful evaluation results with relatively little time and few resources. Using a base set of nine [6] or ten [5] guidelines (or heuristics) regarding common user interface problems, experts conduct a free exploration or a task-based walkthrough of a user interface, verbalizing usability problems as they encounter them. The concept of "experts" may include software professionals, human factors professionals, or work domain incumbents (users) - a form of participatory usability assessment. Recently, Nielsen revised his list of heuristics as a union of seven published sets of usability guidelines [4].

EXTENSIONS TO HEURISTIC EVALUATION

In the language of Floyd [2], Nielsen's heuristics appeared to us to be relatively product-oriented - that is, they assess the system as a relatively self-contained object, without strong contextualization in conditions of use. We hoped to extend Nielsen's approach in a more process-oriented [2] direction, emphasizing the fit of the system to the user and to her/his work needs.

In order to evaluate our extensions, we needed a benchmark. We therefore began with the ten heuristics of [5], whose usefulness has been studied and validated (e.g., [3,5]):
H1. Simple and natural dialogue
H2. Speak the user's language
H3. Minimize memory load
H4. Be consistent
H5. Provide feedback
H6. Provide clearly marked exits
H7. Provide shortcuts
H8. Provide good error messages
H9. Prevent errors
H10. Maintain user control of the system

To these, we added three new heuristics:
H11. Respect the user and her/his skills
H12. Pleasurable experience with the system
H13. Support quality work

We applied these 13 heuristics in a heuristic evaluation of the Learn, Explore And Practice (LEAP) intelligent tutoring system [1], which supports supplemental training of skilled telephone company service representatives. Five human factors experts and three work domain experts (users) participated as evaluators.

RESULTS

After removal of redundancies, our evaluation revealed 247 usability problems, resulting in 89 recommendations to the development team, of which the team accepted 87 percent and implemented 72 percent. Each problem or recommendation was scored by the human factors member of the team as being related to one or more of the 13 heuristics.

Percentages of Problems and Recommendations.

For the purposes of this poster, we compare the independent (i.e., unique) contributions of the new set of heuristics (H11-H13) with those of the original set of ten heuristics from 1992 [5] (H1-H10). Figure 1 summarizes the percentages of (a) usability problems and (b) usability recommendations based on the different sets of heuristics. One or more of the heuristics from the set H1-H10 accounted for 33 percent of problems and 31 percent of recommendations, without any contributions from the set H11-H13. By contrast, one or more of the heuristics from the set H11-H13 accounted for 15 percent of problems and 10 percent of recommendations, without any contributions from the set H1-H10. The remaining 52 percent of the problems and 59 percent of the recommendations appeared to be based on a combination of both sets of heuristics - that is, on at least one heuristic from the set H1-H10 and at least one heuristic from the set H11-H13. Thus, the three new heuristics in the set H11-H13 appeared to have made a unique contribution, independent of contributions from other heuristics, in a sizable percentage of both problems and recommendations.

Figure 1. Problems and recommendations based only on one or more heuristics from the set H1-H10, from the set H11-H13, or from both sets.
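
The three-way split summarized in Figure 1 can be illustrated with a minimal Python sketch. It assumes that each problem or recommendation has been tagged with the set of heuristics it relates to. The function names and the four example items are hypothetical and are not data from the study.

```python
from collections import Counter

ORIGINAL = {f"H{i}" for i in range(1, 11)}   # H1-H10, the 1992 set [5]
NEW = {f"H{i}" for i in range(11, 14)}       # H11-H13, the three new heuristics


def classify(tags):
    """Return which set(s) of heuristics an item's tags draw on."""
    has_original = bool(tags & ORIGINAL)
    has_new = bool(tags & NEW)
    if has_original and has_new:
        return "both"
    if has_new:
        return "new only"
    if has_original:
        return "original only"
    raise ValueError("item is not tagged with any of H1-H13")


def percentages(items):
    """Percentage of items falling in each of the three categories."""
    counts = Counter(classify(tags) for tags in items)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical tags for four items (illustration only, not study data).
example_items = [{"H1"}, {"H11"}, {"H1", "H13"}, {"H3", "H12"}]
print(percentages(example_items))
```

Because every item falls into exactly one category, the three percentages always sum to 100, as do the reported figures (33 + 15 + 52 for problems, 31 + 10 + 59 for recommendations).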

Average Yield.

We also considered these results in terms of the average "yield" (problems or recommendations per heuristic) for each set of heuristics [3]. For usability problems, the average yield of the new heuristics (H11-H13) was 5.0 percent, which compares quite favorably with the 3.3 percent average yield for the 1992 set of heuristics (H1-H10). For recommendations, the average yield of the new heuristics was 3.3 percent, as contrasted with the average 3.1 percent yield of the 1992 set of heuristics (Table 1).
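
The yield figures above are consistent with dividing each set's unique percentage (the share attributed to that set alone) by the number of heuristics in the set. The short Python sketch below reproduces that arithmetic. The function name is hypothetical, and reading the yield as based on the unique percentages is an inference from the reported numbers rather than a stated definition.

```python
def average_yield(unique_percent, n_heuristics):
    """Average percentage of items per heuristic in a set."""
    return unique_percent / n_heuristics


# (unique percentage, number of heuristics), taken from the reported results.
REPORTED = {
    "problems": {"H1-H10": (33, 10), "H11-H13": (15, 3)},
    "recommendations": {"H1-H10": (31, 10), "H11-H13": (10, 3)},
}

for kind, sets in REPORTED.items():
    for name, (share, n) in sets.items():
        print(f"{kind:16s} {name:8s} {average_yield(share, n):.1f} percent")
```

Rounded to one decimal place, this gives 3.3 and 5.0 percent for problems and 3.1 and 3.3 percent for recommendations, matching the values in the text.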

Table 1. Mean yield/heuristic and importance rating for problems and recommendations based on each set of heuristics

Importance.

Finally, we scored each problem or recommendation in terms of its importance, using a scale from 1 (most important) to 5 (least important). Five members of the team, excluding the human factors member but including user members, participated. The problems based on the new heuristics (H11-H13) were rated as slightly but significantly less important than those identified based on the old heuristics (H1-H10) or on both sets (F[2,233]=3.50, p<.04) (Table 1). However, the recommendations were not rated as significantly different (F[2,83]=.71, p>.40). There were no significant differences in the numbers of recommendations that were either accepted (χ²(2)=2.14, p>.30) or implemented (χ²(2)=.63, p>.70).
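
As a consistency check on the reported statistics, the p-value bounds can be recomputed from the test statistics alone. The Python sketch below relies on two closed forms that hold only in special cases: a chi-square statistic with exactly 2 degrees of freedom, and an F statistic with exactly 2 numerator degrees of freedom. Both conditions apply to the values reported here. The function names are hypothetical; a general-purpose check would use a statistics library instead.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


def f_sf_num_df2(f, den_df):
    """Upper-tail probability of an F statistic with 2 numerator degrees of freedom."""
    return (den_df / (den_df + 2.0 * f)) ** (den_df / 2.0)


# Test statistics as reported in the text.
print("importance, problems:        p =", round(f_sf_num_df2(3.50, 233), 3))
print("importance, recommendations: p =", round(f_sf_num_df2(0.71, 83), 3))
print("accepted:                    p =", round(chi2_sf_df2(2.14), 3))
print("implemented:                 p =", round(chi2_sf_df2(0.63), 3))
```

Each computed value falls on the reported side of its bound: below .04 for the problem importance ratings, and above .40, .30, and .70 for the other three tests.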

CONCLUSION

These results show that the three heuristics based on issues of skill, quality, and quality of work life can make a sizable contribution to heuristic evaluation, supplementing or anticipating more qualitative approaches. Future work will integrate these new heuristics with Nielsen's newly revised set of heuristics [4], and will explore the value of one additional heuristic that we did not use in this study:
H14. Protect the user's privacy

Acknowledgments

We thank Carrie Rudman, Jakob Nielsen and Chris Plott for thoughtful discussions.

References


[1] Bloom, C. P., Bell, B., and Linton, F. (1994). The Learn, Explore and Practice intelligent tutoring systems platform. In M. M. Tanik, W. Rossak, and D. E. Cooke (Eds.), Software systems in engineering. New York: ASME.
[2] Floyd, C. (1987). Outline of a paradigm change in software engineering. In G. Bjerknes, P. Ehn, and M. Kyng (Eds.), Computers and democracy: A Scandinavian challenge. Brookfield VT: Gower.
[3] Muller, M.J., Dayton, T., and Root, R.W. (1993). Comparing studies that compare usability assessment methods. INTERCHI'93 Adjunct Proceedings, 185-186. Amsterdam, April 1993.
[4] Nielsen, J. (1994). Enhancing the explanatory power of usability heuristics. Proceedings of CHI'94. Boston: ACM, 152-158.
[5] Nielsen, J., Bush, R.M., Dayton, J.T., Mond, N.E., Muller, M.J., and Root, R.W. (1992). Teaching experienced developers to design graphical user interfaces. Proceedings of CHI'92. Monterey CA: ACM, 557-564.
[6] Nielsen, J., and Molich, R. (1990). Heuristic evaluation of user interfaces. Proceedings of CHI'90. Seattle WA: ACM, 249-256.