Some problems with Preece et al. (1994).

Frank E. Ritter

Started 11/97, for passing to Preece et al. at some point.

Further comments welcome.

Preece gets many many many things correct. In 700 pages I have about 9 substantial disagreements, which I think is pretty low, which is why I am happy to recommend Preece. You should really read Preece et al.

There are, however, a few things where I am just amazed, and which, if you are a student, I do not want you to learn. Here are those things.

Ch. 3.1 Abstracting GOMS

"One of the problems of abstracting a quantitative model from a qualitative description of user performance is ensuring that the two are connected.... it has been noted that the form and content of the GOMS family of models are relatively unrelated to the form and content of the model human processor...and it also oversimplified human behaviour." p. 66

I would certainly not use the word abstract, since the quantitative model is more concrete, so going from qualitative to quantitative is instantiating, not abstracting. I do not believe that GOMS is unacceptably distant from the MHP, at least not yet. I admit it should be closer. Critical-path method GOMS (CPM-GOMS) is closer, but it is not as usable and is not necessary for many tasks.

While GOMS simplifies the analysis of some types of human behaviour, I think it is misleading to simply state that it oversimplifies it. It is like disliking Newtonian mechanics because it reduces the lovely stars at night to just numbers. Yes, GOMS simplifies; that's the point. The way forward is to use the more advanced methods on p. 67, but those approaches are not yet as easy to use. Students and practitioners need a simple method that works well until we have a method that is very accurate and learnable (and until we've won the battle that it's worth learning about models of the users, which many designers would still, wrongly, disagree with).
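To show how little machinery a simple GOMS-style prediction needs, here is a minimal sketch of a Keystroke-Level Model calculation. The operator times are the commonly cited values from Card, Moran and Newell. The task sequence and the names KLM_TIMES and predict_time are hypothetical illustrations, not taken from Preece et al.

```python
# Commonly cited Keystroke-Level Model operator times, in seconds.
KLM_TIMES = {
    "K": 0.28,  # keystroke or button press, average non-secretarial typist
    "P": 1.10,  # point at a target with the mouse
    "H": 0.40,  # home the hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def predict_time(operators):
    """Sum the operator times for a sequence such as 'MHPK'."""
    return sum(KLM_TIMES[op] for op in operators)


# Hypothetical task (an assumption for illustration): think, reach for the
# mouse, point at a field, click, return to the keyboard, type 4 characters.
task = "MHPKH" + "K" * 4
print(f"Predicted time: {predict_time(task):.2f} s")
```

The whole model is a lookup table and a sum, which is the point: it is crude, but it is learnable in an afternoon.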

The desirability of errors to help learning.

"An error free situation would seem equally undesirable." p. 162

This deserves a much more careful answer. In some situations (oh, if we could only define them), creating errors helps us learn by showing us other operators and other outcomes, and perhaps by changing the pace of work to be more interesting. On the other hand, for a crossing light, I see absolutely no useful role for errors for the user. Statements like that have to be qualified, and I hope they will in the future be replaced with a theory to help compute or determine if errors are useful.

"have to make errors to learn" p. 488.

In some situations this is true. Such a statement presented in a blanket way is dangerous (for users and the field) and must be struck out. This is wrong for at least three reasons.

There are other studies where errors are a useful part of the learning process (e.g. Anzai & Simon, 1979, just to start at the beginning of the alphabet), but clearly errors are not at all necessary for many tasks. There may be some tasks, however, where they are useful or even necessary.

Ch. 5.1 Learning and unlearning

The book notes that it can be very difficult to unlearn new key mappings, and that it can be dangerous. I must agree that it can be dangerous, but studies by Kevin Singley and John Anderson (1989) make it very clear that while the frustration may be high, the actual impact on performance, at least for novices, is quite small and takes a relatively short time to disappear without any lasting effect.

Ch. 20.3 Cognitive task analysis

"The underlying assumption of much cognitive psychology is that a human perceives the world and produces some representation of it in his or her mind (sometimes called 'the problem space')." p. 417

This is a very superficial take on problem spaces. Problem spaces are not usually referred to with respect to the whole world, but are used to describe a set of states related to solving a specific problem or task, the operators used to move between states, and an initial state and goal that may be implicit or explicit. The task may be a mental task, it may be external, or it may have components in both places. A user's direct, internal representation of the real world is probably better described as a mental model.

The representation is what we would usually call knowledge.

If the representation is long term, and is available outside the situation, it might be called knowledge. If it is based directly on the state in front of the user/student/subject, then I think it would have to be called a mental state.

Ch. 24. p.488 No cookbook

I disagree that no real cookbook exists and that such a thing will never exist. It is, however, worth looking for a mini-cookbook. There are several good designs for interface components, which are detailed in Preece et al. While they do not provide a whole meal, each of the subdesigns provides useful embellishments, sauces, and cooking methods. The Engineering Data Compendium (Boff & Lincoln, 1988) also provides detailed information on specific designs and approaches that could be combined.

Norman makes a point about standardisation and the ease of automobile use. I think that a list of common features/structures could be created to explain why you can get in an automobile and drive. Software is moving this way, and I think that this is the way forward. Other engineering disciplines have cookbooks (e.g., IC chip design), and some attempts have been made for interfaces (Casner, S., & Larkin, J. H. (1989). Cognitive efficiency considerations for good graphic design. In Proceedings of the Annual Conference of the Cognitive Science Society. 275-282. Hillsdale, NJ: LEA). I don't think it is right or fair or wise to claim that they are not possible.

However, a complete cookbook for all of design is, by definition, not possible.

Interview with Shackel, p. 599

I think Prof. Shackel is ungenerous to himself, to the point of misleading you. He notes things like "I guessed that this ... would cause errors", and "as anyone would". Prof. Shackel has extensive experience, and if a programmer off the street thinks that they can guess as well, or that they can reason like Prof. Shackel without study and/or experience, they are wrong.

Ch. 30.2 Verbal protocols

The references should be expanded and updated. See Ericsson and Simon (1980; 1993).

In Box 30.1, use MacSHAPA (Sanderson et al., 1994) instead; it is still supported and available in the department.

A verbal protocol by definition (not 'sometimes') contains the user's utterances while performing (or reviewing) a task. (Protocols can also include task actions ordered sequentially; these are non-verbal protocols.) Proper, acceptable verbal protocols contain what is in the user's working memory and what they are paying attention to. Protocols that contain subjects' thoughts on how they think they think are bad because (at least) (a) the subject is not paying full attention to doing the task, and (b) their theories about how they think often contain useful ideas but are often quite wrong as well, because such knowledge comes only from observation, which by definition users will have trouble doing on themselves. Ericsson and Simon provide a fuller description of acceptable practice and pitfalls in this area.

While users are poor at maintaining divided attention, generating verbal protocols is not too bad a secondary task, and the effects are pretty well laid out in Ericsson and Simon's works. You do not have to think of ways to support users in order to use verbal protocols, but you might expect them to take a little longer and to have just a few more insights than users not talking aloud.

Do not ask a user to "tell us everything that you are doing or thinking about doing." A much clearer and more valid instruction is to ask them to "say out loud everything that you say to yourself". If subjects talk about what they are thinking of doing, they start to introspect about what they are doing rather than talk about what's in their working memory.

And instead of "what are you thinking now", "keep talking" and "please keep talking" are safer and should work better.

"Post-event protocols" are probably better known as retrospective protocols (but both are good names).

Ch. 30.2 Analysing an example

Ritter and Larkin (1994) provide a nice overview of using process models as summaries of sequences of user actions.

The Dismal spreadsheet includes an easy-to-use data logging package, and the Tcl/Tk Soar Interface does as well. Both are available from me and/or my web pages.

Box 30.2 Variations on interviews

The field of expert systems (e.g. ) would not call what happens in Box 30.2 card sorting. Card sorting is more typically characterised by sorting cards (with objects or their names on them) into piles of objects with similar features, uses or other characteristics.

Ch. 31. The description of experiments is eccentric.

Calfee (1985), for example, would talk about between-subjects designs (independent subject designs), matched between-subjects designs (matched subject designs), and within-subjects designs (repeated measures). Repeated measures, in psychology at least, is reserved for studies where multiple measures are taken within a single condition (e.g., "Do you like your mother?", "Do you still like your mother?", "How do you feel about your mother?").
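The difference between the first and last of these designs can be made concrete with a small sketch of how subjects get allocated to conditions. This is only an illustration under my own assumptions: the function names, subject labels and conditions are hypothetical, and the simple rotation used for ordering is one of several ways to counterbalance.

```python
import random


def between_subjects(subjects, conditions, seed=0):
    """Between-subjects: each subject is randomly given exactly one condition."""
    rng = random.Random(seed)   # fixed seed so the allocation is reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Deal the shuffled subjects out to conditions in turn, keeping groups equal.
    return {s: conditions[i % len(conditions)] for i, s in enumerate(shuffled)}


def within_subjects(subjects, conditions):
    """Within-subjects: each subject sees every condition, in a rotated order."""
    n = len(conditions)
    return {
        s: [conditions[(i + k) % n] for k in range(n)]
        for i, s in enumerate(subjects)
    }


subjects = ["s1", "s2", "s3", "s4", "s5", "s6"]   # hypothetical subject labels
conditions = ["A", "B", "C"]                       # hypothetical conditions

print(between_subjects(subjects, conditions))
print(within_subjects(subjects, conditions))
```

A matched design would replace the shuffle with pairing subjects on some measured variable before dealing them out; that step is left out here.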

Nits

p. 490. Users don't memorise 7 items, but remember items. Memorising comes after the 7 s presentation, and only after additional mental processing.

p. 650. Need to explain that control means to measure and set variables in the conditions, or make sure that the values of a variable do not correlate with subjects (or materials).

Should note that analyses should be planned before data is collected to make sure all necessary data is collected and that it is in a form that can be used.

Ch. 13. p. 266 Pie menus. Callahan et al. (1988) is missing in the references. I have rederived an analysis using Fitts' law showing why pie menus are fast.
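The flavour of such an analysis can be sketched as follows. This is not my rederivation itself, just a minimal illustration using the Shannon form of Fitts' law, MT = a + b * log2(D/W + 1). The coefficients a and b, the menu geometry, and the function names are all assumptions chosen for illustration.

```python
import math


def fitts_time(distance, width, a=0.1, b=0.1):
    """Shannon form of Fitts' law; a (s) and b (s/bit) are assumed values."""
    return a + b * math.log2(distance / width + 1)


def linear_menu_mean_time(n_items, item_height):
    """Mean time over items in a pull-down menu, cursor starting at the top.

    Assumes item i (1-based) is centred (i - 0.5) item heights away and that
    the target width along the direction of movement is the item height.
    """
    times = [
        fitts_time((i - 0.5) * item_height, item_height)
        for i in range(1, n_items + 1)
    ]
    return sum(times) / n_items


def pie_menu_mean_time(n_items, radius):
    """Time to reach any slice of a pie menu, cursor starting at the centre.

    Assumes every slice is reached at the same radius and that the target
    width is the arc length of the slice at that radius.
    """
    width = 2 * math.pi * radius / n_items
    return fitts_time(radius, width)


print(f"linear: {linear_menu_mean_time(8, 20):.3f} s")
print(f"pie:    {pie_menu_mean_time(8, 40):.3f} s")
```

Under these assumptions D/W for the pie menu is n/(2*pi), so the predicted time depends only on the number of slices and not on the radius, while the linear menu gets slower for every item further down the list.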

p. 273 Norman and Draper (1986) is not directly in the references. You can find it in the reference to Norman (1986). It is an interesting and influential book.

In Ch. 5.1, the use of <ctrl> for the control key is messy, and could probably be done in a clearer way.

References

Anderson, J. R., Farrell, R., & Sauers, R. (1984). Learning to program in LISP. Cognitive Science, 8, 87-129.

Anzai, Y., & Simon, H. A. (1979). The theory of learning by doing. Psychological Review, 86, 124-140.

Boff, K. R., & Lincoln, J. E. (1988). Engineering data compendium: Human perception and performance. Wright-Patterson Air Force Base, OH: Harry G. Armstrong Aerospace Medical Research Laboratory.

Calfee, R. C. (1985). Experimental methods in psychology. New York, NY: Holt, Rinehart and Winston.

Ericsson, K. A., & Simon, H. A. (1980). Verbal reports as data. Psychological Review, 87, 215-251.

Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data. Cambridge, MA: The MIT Press.

Reder, L. M., & Ritter, F. E. (1992). What determines initial feeling of knowing? Familiarity with question terms, not the answer. Journal of Experimental Psychology: Learning, Memory & Cognition, 18(3), 435-451.

Ritter, F. E., & Larkin, J. H. (1994). Using process models to summarize sequences of human actions. Human-Computer Interaction, 9(3), 345-383.

Sanderson, P. M., Scott, J., Johnston, T., Mainzer, J., Watanabe, L., & James, J. (1994). MacSHAPA and the enterprise of exploratory sequential data analysis (ESDA). International Journal of Human-Computer Studies, 41, 633-681.

Siegler, R. S. (1986). Children's thinking. Englewood Cliffs, NJ: Prentice-Hall.

Singley, M. K., & Anderson, J. R. (1989). The transfer of cognitive skill. Cambridge, MA: Harvard University Press.