ritter, gdb @psyc.nott.ac.uk
Phone: ++44 (115) 951 5302; Fax: 951 5324
ESRC Centre for Research in Development, Instruction and Training
Technical Report No. 40
Department of Psychology, University of Nottingham, Nottingham NG7 2RD, U.K.
This document is available from http://www.ccc.nottingham.ac.uk/pub/soar/
Any errors remain the fault of the authors, and an acknowledgement here does not indicate an endorsement of this report from those listed.
Support for this work has been provided by the DRA, contract number 2024/004, and by the ESRC Centre for Research in Development, Instruction and Training.
We describe here a revision of a previously presented cognitive model. In this revision we do not expand the model to cover more data directly; instead, we revise it to be more useful. We do this by (a) making the model easier to understand and (b) making the model easier to apply to additional data, that is, to reuse it. We believe that including these two features will set a new standard for models, and it further fulfils the promise this model initially offered, that of providing an account of knowledge application in formalisable domains.
We are using a model of physics problem solving called Able (Larkin, 1981; Larkin, McDermott, Simon, & Simon, 1980b). Able solves kinematics problems by applying physics principles. It initially uses a backward-chaining, means-ends analysis to find which principles to apply, starting with the target variable; after learning, it ends up with a more expert-like behaviour, solving problems without search. While it does not model the complete process, such as performing the algebraic manipulations or learning the principles, it is one of the best models of novice to expert transition in physics problem solving.
Able was initially written by Larkin as two related models: ME, to simulate novice physics problem solvers (barely able), and KD, to simulate expert problem solvers (more able) in kinematics (Larkin et al., 1980b) and fluid statics (Larkin & Simon, 1981). These models were compared with problem solving protocols, which they matched very well. The models were later unified by a learning mechanism that learned while solving problems, showing how the novice model could learn to become an expert model (Larkin, 1981). This unified Able model was translated into Soar 4 by Levy (1991), essentially showing that the learning mechanism used by Larkin was very similar to the chunking mechanism in Soar (Newell, 1990). Levy's work remains an interesting example of how quickly someone can learn and model in Soar, for he wrote it in two weeks. His model is where we started.
There are several things that we wanted from such a modelling utility. We wanted it to match or explain some data and to make predictions about additional behaviour; Able already does this. We also wanted it to include learning, particularly learning that could implement the transition from novice (backward-chaining) behaviour to expert (forward-chaining) behaviour (e.g. Klein, 1989). Able already does this as well.
We also wanted the model to be easier to understand and to explain. To do this, we created some graphic displays in the scripting language associated with Soar. Some of these displays are specific to Able, and some can be used with other models. When we previously reused Able, doing so was not difficult, but it was not as direct as we might have liked. The bulk of our effort reported here went into regularising Able's structure so that it could provide a general mechanism that could be routinely reused.
Lastly, one thing you might want from a cognitive model is the model itself. If a unified theory of cognition is going to provide an approach that supports integrating models, the models must be available. The clearest way to support this is to put the documented model in a public place. Our version of Able, Able, III, with its associated displays and principle application utility, is available from http://www.ccc.nottingham.ac.uk/pub/soar/nottingham/. It is useful as an example Soar model and as a utility for creating additional models by adding principles. The general graphic displays will be useful to anyone creating, examining, or teaching Soar models. It currently runs only with the main Soar 7.0.0 beta release.
Figure 1 shows the operators and their relationships. After a problem has been retrieved with FETCH-PROBLEM, problem solving proceeds with a top-level operator proposing to solve the problem. DEVELOP-KNOWLEDGE will later implement single inference steps that directly solve the problem, but initially nothing can be done, and an impasse is noted by the architecture. In this impasse, the target variable is selected as the variable to solve. Operators are proposed to apply each of the principles on the state. Some fairly powerful heuristic knowledge is included that not all problem solvers have (but the subjects covered here did have). APPLY-PRINCIPLE operators that do not have many unknown variables are preferred, but more importantly, operators that propose principles including the target variable are preferred. Principles with the same number of unknowns and the same relationship to the target are made equivalent. If additional domain knowledge about which principles to prefer were available, it could apply here.
Figure 1. The structure of the operators in Able. Arrows indicate order of application and relationships in the hierarchy. Ellipses (...) indicate that multiple applications of the previous operator may occur.
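The preference heuristics just described can be sketched in Python. This is an illustrative sketch only: Able itself is a set of Soar productions, and the function names, principle names, and encoding here are ours, not the model's. Candidate principles that mention the target variable come first, with ties broken by the number of unknown variables.

```python
# Hedged sketch of Able's operator-selection heuristics. Principles are
# named equations mapped to the set of variables they relate.

def rank_principles(principles, known, target):
    """Return candidate principle names, best first."""
    def score(name):
        variables = principles[name]
        # Ascending sort: 0 = mentions the target, then fewer unknowns.
        return (0 if target in variables else 1,
                len(variables - known))
    # Principles tied on both criteria remain equivalent (stable sort).
    return sorted(principles, key=score)

# A toy kinematics example (illustrative, not from the model's source).
principles = {
    "v = v0 + a*t":       {"v", "v0", "a", "t"},
    "x = v0*t + a*t^2/2": {"x", "v0", "a", "t"},
}
print(rank_principles(principles, known={"v0", "a", "t"}, target="x"))
# ['x = v0*t + a*t^2/2', 'v = v0 + a*t']
```

Because the sort is stable, principles that are equivalent under both criteria keep their proposal order, mirroring the indifference among equally preferred operators described above.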
The implementation of APPLY-PRINCIPLE operators is not initially available, for it, too, is learned. Another impasse occurs, and lower level CHECK-VARIABLE operators check each of the variables in the principle. If all but the target are known, then the target can be derived, and this is passed up to the higher operators. If variables other than the target are not known, DEVELOP-KNOWLEDGE is applied recursively, with the unknown variable as a target. This leads to the backward-chaining behaviour that is typical of novices in this domain (Larkin, McDermott, Simon, & Simon, 1980a; Larkin et al., 1980b).
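The recursion described above can be sketched as follows. This is a minimal illustration, not the Soar implementation; the encoding of principles as variable sets and all names are ours. Each principle relates a set of variables, and any one variable can be derived once all the others in that principle are known:

```python
# Hedged sketch of the novice, backward-chaining strategy: to derive a
# target, find a principle containing it and recursively make the
# principle's other variables known (the CHECK-VARIABLE step).

def develop_knowledge(target, known, principles, stack=frozenset()):
    """Try to derive `target`, adding it to `known` on success."""
    if target in known:
        return True
    if target in stack:  # avoid chasing circular subgoals
        return False
    for variables in principles:
        if target in variables and all(
                develop_knowledge(v, known, principles, stack | {target})
                for v in variables - {target}):
            known.add(target)  # all other variables known: derive target
            return True
    return False

# Toy principles: v = v0 + a*t and x = v0*t + a*t^2/2.
principles = [{"v", "v0", "a", "t"},
              {"x", "v0", "a", "t"}]
known = {"v0", "a", "t"}
print(develop_knowledge("x", known, principles))  # True; "x" is now known
```

The recursive call with the unknown variable as the new target is what produces the backward chaining; the `stack` argument is our own guard against circular subgoals, which the impasse mechanism in Soar handles differently.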
During problem solving, chunks (new, learned productions) are created that encapsulate the essential aspects of the impasse and the result that was used to resolve it. These chunks allow APPLY-PRINCIPLE to be applied atomically when similar circumstances occur. With additional problem solving (the bottom-most chunks must be learned first), the derivation of unknown variables from known variables comes to occur with the DEVELOP-KNOWLEDGE operator as well.
Learning changes how Able solves problems. With enough practice, fully learned behaviour occurs with DEVELOP-KNOWLEDGE solving problems directly through application of the learned rules, in a forward-chaining way, using the existing known variables to derive additional known variables. The model changes from being driven in a goal-directed way to apply principles to derive the target variable, to being data-driven, where the known variables are used to directly derive additional known variables.
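This data-driven behaviour can be sketched as follows. Again this is an illustration, not Soar's chunking mechanism itself, and the names are ours. Each learned chunk maps a set of known variables directly to one newly derivable variable, so solving becomes a short forward pass with no search:

```python
# Hedged sketch of expert, forward-chaining behaviour: learned chunks
# fire atomically, each deriving one new variable from known ones.

def forward_chain(known, chunks):
    """Apply chunks repeatedly until no new variable can be derived."""
    known = set(known)
    changed = True
    while changed:
        changed = False
        for conditions, result in chunks:
            if result not in known and conditions <= known:
                known.add(result)  # the chunk fires in one step
                changed = True
    return known

# Two illustrative chunks for a toy kinematics problem.
chunks = [({"v0", "a", "t"}, "v"),
          ({"v0", "a", "t"}, "x")]
print(sorted(forward_chain({"v0", "a", "t"}, chunks)))
# ['a', 't', 'v', 'v0', 'x']
```

The contrast with the backward-chaining strategy is the direction of control: here nothing is selected as a target, and the known variables alone determine which derivations fire.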
Practice also drastically reduces how long it takes Able to solve a problem. In our implementation, Able initially takes 27 model (decision) cycles to solve a typical problem (number 5) on the first attempt. This time includes finding the principles to apply, checking each of the variables, and recursively applying principles when they are necessary. After practice over 7 trials with the same problem, performance no longer improves and solving the problem takes just 2 model cycles. The learning curve that is generated does not fit a power law, but it is difficult to comment further, because there are multiple aspects of the task not yet included in the model.
Able's novice/expert performance characterisation is similar in several ways to Klein's (1989) model of recognition-primed decision making (which might more correctly be called "recognition-led problem solving in interactive tasks"). Like Klein's model, Able after learning works forward from known information; it is based on previous problem solving; and it does not consider alternative actions. Able differs in that it is spelled out in enough detail to implement some of the structural details of behaviour, but in a limited area, whereas Klein's model remains a descriptive model that has been broadly applied.
Able has been applied only to formal domains so far, those "involving a considerable amount of rich semantic knowledge but characterised by a set of principles logically sufficient to solve problems in the domain" (Larkin, 1981, p. 311). Mathematics, physics and sophisticated games (e.g. chess) are formalisable; biology and English literature are not. Whether Klein's domains (e.g. fire fighting) are formal or can be formalised is unclear. The field of cognitive science would assume that they could be, and attempts to build expert systems in these areas are consistent with that belief. Able suggests that it may be possible to create cognitive models that not only perform the task but, through the way performance improves with practice, also start to explain the expert/novice differences that Klein reports.
The relative ease with which Able was translated shows that the basic Soar architecture has not changed much since 1989 in the aspects that Able relies upon. While the functionality has basically stayed the same (Able-Soar, Jr. solved 13 unique dynamics problems; Able, III, solves 16), the number of rules has slightly decreased, from 52 basic rules (excluding monitoring and problem generating rules) to 47. It is not the case that the rules have become more complicated, for the number of clauses has decreased even more dramatically, from 400 to 218. These changes suggest that the Soar architecture has become simpler without substantially changing, which is what its architects have endeavoured to do (Laird, Huffman, & Portelli, 1990).
This update indicated that Able was not fundamentally affected by the changes in the architecture over the last five years. One of the valid criticisms made by Cooper and Shallice (1995) was that as the Soar architecture is modified, older models must be carried forward for their results to remain valid. This has not typically happened each time the architecture has been released as new software. The Soar community has not been convinced of the need, because they understood the changes, and theoretically the changes have nearly always been small, with limited impact on existing models. Able is a relatively straightforward model, but the absence of problems suggests that the approach Cooper and Shallice put forward did not correctly classify changes, and that the changes they noted are typically implementation choices rather than changes in the theory. More complicated models, however, may have a greater chance of suffering from architectural changes.
We have created two types of displays for working with Able, III. Some of these displays will be useful for developing and explaining nearly any Soar model, and some specifically when working with Able, III.
The continuous goal stack display, shown at the top of Figure 2, indicates the order of operator applications and the current goal stack. Users can display the substructure of objects in the stack in a separate window (not shown). Users can also select how much detail is displayed, choosing to print several layers of substructure by default, or continue to examine substructures by hand.
The continuous match set display, shown at the bottom of Figure 2, provides a display of the rules that have matched the current working memory, and will fire in the next model cycle. Users can print them in a separate window for inspection.
Only anecdotal evidence about the usefulness of these displays is available so far. When they were introduced to a graduate class on programming cognitive models, all of the students used them (whereas they would not use other available tools for writing Soar models).
Figure 2. The general TSI display windows included with Able, III, showing the current goal stack (top window), and the rules that will fire next (bottom window).
Figure 3. The problem display in Able, showing the problem (as text in the top pane) and the current status (known/unknown) of the variables. The target variables, Time spent (t) and Distance (x), are shown in raised text on the screen, which appears here as underlined text.
The second display, shown in Figure 4, indicates the order of principle application. It shows that the Able model, when it is a novice (really an apprentice, since it already knows something) and before it has learned, works backward from the target variable. The more expert Able, after it has solved problems and has nearly doubled its number of rules, does not appear to apply principles at all, but works forward, immediately deriving what is known. This display is based on the application of principles, so it will work with any set of principles represented with Able.
Additional problems can be included by representing their features on the top state. New principles are represented one per production rule. The existing mechanisms in Able can then solve the problem. Additional knowledge can be added, but the weak methods of search in Soar will otherwise solve the problem if it is solvable.
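As a rough illustration of this interface (our own sketch; in Able each principle is literally one Soar production and problems are attribute-value structures on the top state), adding a problem and a principle amounts to registering two small declarative entries:

```python
# Hedged sketch of extending Able: a "top state" holding problems and
# the available principles. All names here are illustrative.

top_state = {"principles": [], "problems": {}}

def add_principle(name, variables):
    """One principle per entry, mimicking one production per principle."""
    top_state["principles"].append((name, frozenset(variables)))

def add_problem(name, known, target):
    """A problem is its known variables plus a target variable."""
    top_state["problems"][name] = {"known": set(known), "target": target}

add_principle("v = v0 + a*t", {"v", "v0", "a", "t"})
add_problem("problem-5", known={"v0", "a", "t"}, target="v")
print(len(top_state["principles"]), top_state["problems"]["problem-5"]["target"])
# 1 v
```

The point of the design is that nothing else needs to change: the existing search and learning mechanisms operate over whatever principles and problems are registered.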
This mechanism could be used to model novice-expert transitions in other domains, or at least provide a way to include routine learning in models. With any set of principles, initial behaviour of the model to choose and apply the principles will be effortful and susceptible to dead-ends. With practice, the model's performance will become situation driven and faster. This approach may make it easier to create models in Soar by providing a mechanism that more closely approximates the highest conceptual level, the knowledge level (Newell, 1982).
Figure 4. The Application of principles display in Able shows the order that principles are applied. With practice on this problem, explicit reference to principles will disappear.
Problems remain with using Able as a utility, however. It was developed to model behaviour in formal domains, and not all domains are formal. It models the novice-expert transition in well under 100 trials, whereas the transition normally takes years of practice. The transition that is modelled, learning the order in which to apply principles, may indeed be learned this quickly, but then the model is not modelling the full gamut of knowledge that makes up an expert. The principle application mechanism is also unrealistic in the way it uses working memory: it keeps the problem and all the principles on the top state, which is not appropriate. These flaws should not be taken as reasons to reject it, but rather as clear indications of where it can be improved.
The other addition to cognitive modelling that Able, III proposes is the explicit need to abstract and export the fundamental mechanism for inclusion in other models, even when working within a cognitive architecture. Here, the principle application mechanism becomes a utility as a new programming language. This is an important exemplar, for cognitive models, as sets of knowledge, should be reusable, including their knowledge based mechanisms.
How best to document and explain the model's code remains an open problem (e.g. as in "Soarware engineering", a term coined by David Steier). We have taken some care to document and explain the model as a program. Good practice in this area has not fundamentally changed in the last five years; that is, there are no established standards and no well accepted best way to explain the model as productions. With Richard Young, we have tried creating "illuminated code", which includes embedded HTML commands to illustrate code and provide additional information (an example is available at http://www.psyc.nott.ac.uk/users/ritter/pst/analogy/answers/anal-ans4.html). Illuminated code was, in the end, quite useful for teaching, but in our limited experience not good for writing and extending live code.
We believe that for cognitive modelling to succeed, not just survive, we will have to prepare more models in this style. Models must become easier to understand, easier to extend, and easier to reuse. Displays and packaging programs as utilities are a way to do this.
Bass, E. J., Baxter, G. D., & Ritter, F. E. (1995). Using cognitive models to control simulations of complex systems. AISB Quarterly, 93, 18-25.
Congdon, C. B., & Laird, J. E. (1995). The Soar User's Manual, Version 7. Ann Arbor, MI: Electrical Engineering and Computer Science Department, U. of Michigan.
Cooper, R., & Shallice, T. (1995). Soar and the case for unified theories of cognition. Cognition, 55, 115-149.
Hucka, M. (1994). The Soar Development Environment. Ann Arbor, MI: Artificial Intelligence Laboratory, U. of Michigan. Also available through http://www.cs.cmu.edu/afs/cs/project/soarwww/soar-archive-software.html.
Klein, G. A. (1989). Recognition-primed decisions. In W. B. Rouse (Ed.), Advances in Man-machine Systems Research (Vol. 5). 47-92. Greenwich, CT: JAI.
Laird, J., Huffman, S., & Portelli, M. (1990). Status of NNPSCM and S-support. In T. Johnson (Ed.), Thirteenth Soar Workshop. 49-51. The Ohio State University: The Soar Group.
Larkin, J. H. (1981). Enriching formal knowledge: A model for learning to solve textbook physics problems. In J. R. Anderson (Ed.), Cognitive Skills and Their Acquisition. 311-334. Hillsdale, NJ: LEA.
Larkin, J. H., McDermott, J., Simon, D. P., & Simon, H. A. (1980a). Expert and novice performance in solving physics problems. Science, 208, 1335-1342.
Larkin, J. H., McDermott, J., Simon, D. P., & Simon, H. A. (1980b). Models of competence in solving physics problems. Cognitive Science, 4, 317-345.
Larkin, J. H., & Simon, H. A. (1981). Learning through growth of skill in mental modeling. In H. A. Simon (Ed.), Models of Thought II. 134-144. New Haven, CT: Yale University Press.
Larkin, J. H., & Simon, H. A. (1987). Why a diagram is (sometimes) worth ten thousand words. Cognitive Science, 11, 65-99.
Levy, B. (1991). Able Soar Jr: A model for learning to solve kinematic problems. Unpublished.
Newell, A. (1982). The knowledge level. Artificial Intelligence, 18, 87-127.
Newell, A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press.
Nichols, S., & Ritter, F. E. (1995). A theoretically motivated tool for automatically generating command aliases. In Proceedings of the CHI `95 Conference on Human Factors in Computer Systems. 393-400. New York, NY: ACM.
Ousterhout, J. K. (1994). Tcl and the Tk toolkit. Reading, MA: Addison-Wesley.
Ritter, F. E. (1993). TBPA: A methodology and software environment for testing process models' sequential predictions with protocols (Technical Report No. CMU-CS-93-101). School of Computer Science, Carnegie-Mellon University.
Ritter, F. E., & Larkin, J. H. (1994). Using process models to summarize sequences of human actions. Human-Computer Interaction, 9(3&4), 345-383.