
Implicit rule learning and the power law: A Soar model of skill acquisition in Scheduling

Josef Nerb, Frank E. Ritter, and Josef F. Krems

Dept. of Psychology, University of Freiburg, Niemenstr. 10, D-79085 Freiburg, Germany

School of Psychology, U. of Nottingham, Nottingham NG7 2RD UK

Dept. of Psychology, University of Chemnitz, D-09107 Chemnitz, Germany

(email: nerb@psychologie.uni-freiburg.de, josef.krems@phil.tu-chemnitz.de, frank.ritter@nottingham.ac.uk)


Summary. The chunking mechanism in Soar has been used in numerous ways. The model presented here, which performs a job-shop scheduling task, uses chunking to learn rule-like behaviour gradually while performing the task. The model learns episodic memory chunks while solving the scheduling tasks. This mechanism demonstrates how symbolic models can exhibit a gradual change in behaviour and how general rules can be acquired without resort to explicit declarative rule generation, a type of implicit learning. The model fits many qualitative (e.g. learning rate) and quantitative (e.g. solution time) regularities found in previously collected data. The model's predictions were tested against data from a new study in which the same scheduling task was given to the model and to 14 subjects. The model generally fit this new data, with two qualifications: the task is easier for the model than for the subjects, and the model's performance improves more slowly. The model provides an explanation of the noise typically found when fitting a set of data to a power law -- it is the result of learning actual pieces of knowledge that transfer more or less, but rarely an average amount. Only when the data are averaged (here, over subjects) does the smooth power law appear.


Introduction

Soar, a candidate unified theory of cognition (Newell, 1990), is a cognitive architecture designed to model all of human behaviour, including learning. We note here several aspects of the architecture that particularly influence and support learning, and illustrate them with a model, Sched-Soar, which performs a job-shop scheduling task. Sched-Soar learns partial episodic rules that slowly approximate rule-based behaviour. The behaviour of Sched-Soar is consistent with the general regularities upon which it was based, and with new data collected to test it. These will be covered in turn.

Overview of Soar

Soar is an architecture that provides a small set of mechanisms for simulating and explaining a broad range of phenomena as different as perception, reasoning, decision making, memory, and learning. It is this restriction on the one hand, and the breadth of intended -- but as yet only partially realised -- applications on the other, that makes Soar a candidate unified theory of cognition (Newell, 1990). Unified does not mean that all behaviour must be expressed by a single mechanism (although Soar uses relatively few), but that the mechanisms must work together to cover the range of behaviour.

There are extensive explanations and introductions to the architecture (Lehman, Laird, & Rosenbloom, 1996; Norman, 1991), with some available online (Baxter & Ritter, 1996; Ritter & Young, 1996), so we will only briefly review Soar, as shown in Figure 1.

Figure 1: The main processes in Soar. [to be taken from the Psychological Soar Tutorial]

Soar is best described at two levels, called the symbol level and the problem space level. Behaviour at the symbol level consists of applying (matching) knowledge represented as production rules to the current state in working memory (which includes the impasse stack described below). This level implements, or gives rise to, the problem space level.

Behaviour at the problem space level represents problem solving as search through problem spaces of states using operators. Operators are created and implemented using the symbol level. In routine behaviour, processing proceeds by a repeated cycle of operator applications. When there is a lack of knowledge about how to proceed, an impasse is declared. An impasse can be caused, for example, by the absence of operators to apply, by an operator that lacks the knowledge to make changes to the state (a no-change impasse), or by tied operators (i.e., when more than one operator is applicable at the same time). In an impasse, further knowledge can typically be applied about how to resolve the problem. States S2 and S3 in Figure 1 are impasse states. As problem solving on one impasse may lead to further impasses, a stack of impasses often results. An impasse defines its own context for problem solving. The stack changes as impasses are resolved through problem solving or as new impasses are added.
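The following sketch shows, in Python, the skeleton of such a decision cycle. It is our illustrative abstraction, not the Soar implementation, and all names are invented: productions propose operators for the current state, preference knowledge selects among them, and a tie or a missing result creates an impasse that gets its own problem solving context on the stack.

from dataclasses import dataclass, field

# Illustrative sketch of a Soar-like decision cycle (our abstraction,
# not the Soar implementation). Productions propose operators; the
# decision procedure selects one; failures create impasses on a stack.

@dataclass
class Operator:
    name: str
    preference: float
    apply: callable            # state -> new state, or None (no change)

@dataclass
class Impasse:
    kind: str                  # "state-no-change", "tie", "operator-no-change"
    state: dict
    candidates: list = field(default_factory=list)

def decision_cycle(state, propose, stack):
    """One decision: propose operators, select one, apply it."""
    proposed = propose(state)
    if not proposed:
        stack.append(Impasse("state-no-change", state))
        return state
    best = max(op.preference for op in proposed)
    tied = [op for op in proposed if op.preference == best]
    if len(tied) > 1:
        stack.append(Impasse("tie", state, candidates=tied))
        return state
    new_state = tied[0].apply(state)
    if new_state is None:
        stack.append(Impasse("operator-no-change", state))
        return state
    return new_state

# Two equally preferred operators produce a tie impasse:
ops = [Operator("schedule-job-1", 0.5, lambda s: {**s, "next": 1}),
       Operator("schedule-job-2", 0.5, lambda s: {**s, "next": 2})]
stack = []
decision_cycle({}, lambda s: ops, stack)
print(stack[0].kind)           # -> "tie"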

When knowledge about how to resolve an impasse becomes available from problem solving within the context of the impasse, a chunk (a newly acquired production rule) is created. This acquired rule has as its condition part the knowledge from the higher context that was used to resolve the impasse, and as its action part the changes that resolved it. In addition, the condition and action parts are generalised (by exchanging constants with variables). Thus, the chunk will recognise the same and similar situations in future problem solving and -- by applying the chunk's action part -- problem solving will proceed without an impasse. Figure 2 illustrates how chunks are acquired. In this example, the operator compute-sum has been selected, but immediate knowledge is not available to provide the answer. An impasse is declared, namely that compute-sum could not change the state. Based on this impasse and its type, knowledge about counting in a counting problem space is proposed, and counting is performed. The outcome can be returned as the result of compute-sum. When the result is returned, a new rule, Chunk1, is created. It has as its conditions the information from the higher level context used to solve the impasse; in this case, the name of the operator and its arguments. The action of Chunk1 is the result passed back to the higher context, in this case, that the answer was 10. In the future, when compute-sum is selected with the same arguments, Chunk1 will match, providing the result, and an impasse will not occur.


Figure 2: How chunking encodes a new rule. Adapted from Howes and Young (1997).
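As a loose analogy (ours, in Python, with hypothetical names), chunking in this example behaves like goal-driven caching: the subgoal's result is stored as a new rule keyed by the operator and its arguments, so the same impasse does not recur.

# Illustrative analogy only: chunking caches the result of subgoal
# problem solving as a new rule, so the impasse does not recur.
chunks = {}   # learned rules: (operator, args) -> result

def compute_sum(a, b):
    key = ("compute-sum", a, b)
    if key in chunks:               # Chunk1 matches: no impasse
        return chunks[key]
    # Impasse: no immediate knowledge; solve in a counting problem space.
    result = a
    for _ in range(b):              # count up b times
        result += 1
    chunks[key] = result            # chunk built from the subgoal's result
    return result

print(compute_sum(4, 6))  # first call: impasse, counting, learns Chunk1 -> 10
print(compute_sum(4, 6))  # second call: Chunk1 fires, the answer is immediate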

Learning Mechanisms in Soar

There have been many learning mechanisms implemented in Soar using the chunking mechanism; the type and meaning of the impasses differentiate them. Many of these have been part of AI models, or of cognitive models that have not been tested against human data, although we know that humans can learn in these ways as well. These include models that learn from instructions (Huffman & Laird, 1995), models that learn through reflection (Bass, Baxter, & Ritter, 1995; John, Vera, & Newell, 1994; Nielsen & Kirsner, 1994), models that learn through analogy (Rieman, Lewis, Young, & Polson, 1994), models that learn through abduction (Johnson, Johnson, Smith, DeJongh, Fischer, Amra, et al., 1991; Krems & Johnson, 1995), and a model that simulates novice to expert transitions (Ritter, Jones, & Baxter, in press) using a mechanism similar to and based on Larkin's (Larkin & Simon, 1981) transition mechanism based on solving sub-goals in a means-ends analysis. Further examples are available in the Soar Papers (Rosenbloom, Laird, & Newell, 1992) and through pointers in the Soar FAQ (Baxter & Ritter, 1996).

There are, however, several models implemented in Soar that learn and that have had their predictions compared with human data. Most of these models are in the field of human computer interaction. Altmann's (Altmann & John, in press) model of interaction suggests that learning is pervasive when interacting with the environment. His model uses learned information to help with searching for objects in an interface. This model suggests that searching in any environment is assisted by learning. His model has been compared with verbal protocols.

Young, Howes, and Rieman have also developed numerous models of computer users that learn. One model illustrates how mappings between tasks and actions are acquired (Howes & Young, 1996). The model learns the action to perform to accomplish a task through external search leading to learning in a multi-pass algorithm. There are also models that learn through external scanning and internal comprehension (Rieman, Young, & Howes, 1996), and through exploration to recognise states and information near the target (Howes, 1994). These all have been compared against empirical phenomena in exploratory learning. The processing is heavily recognition-based, so that the search of a large external space of options can be performed with a small working memory.

Driver-Soar (Aasman & Michon, 1992) drives a simulated car through a residential area, avoiding bicycles and parked and moving cars. It learns plans for activities like negotiating intersections, and it learns how to control the car more accurately. Its behaviour has been compared with human driving data, including the order of behaviours and their times for visual orientation, motor control, and car speed.

SCA (Miller & Laird, 1996) learns concepts by starting with very general classification rules and learns more specific rules. It has been compared with aggregate data of subjects learning to classify novel stimuli.

Diag (Ritter & Bibby, 1997) is a model of troubleshooting that finds faults in a device. It learns which objects to examine and how to carry out its internal problem solving more efficiently. It is one of the few models that have been compared with aggregate and verbal protocol data while both subject and model learn.

Chong and Laird (1997) have created a series of models that become better at solving a dual task. The dual task they model combines a tracking task with a choice-reaction time task. They identify many places where learning could occur in this sequence from novice to expert. They also implement a mechanism to learn the ordering and precedence of external behaviour by resolving ties for multiple intended behaviours.

Several, but certainly not all, of these models simulate learning that could be categorised as explicit learning (Frensch, 1998), that is, learning in which the learner knows that, and why, competence and performance have improved. The model we propose deals with learning in a task in which human subjects improve implicitly, that is, without knowing why and how.

The model's task is a planning task that requires sorting N elements into a sequence that conforms to a given criterion, where only one of the N! possible sequences fulfils the criterion (i.e. finding the best schedule). We will present a model, Sched-Soar, that does the task and improves in performance both qualitatively and quantitatively, in a way similar to human subjects. First, however, we describe the task and prior empirical results from subjects solving it. Those regularities serve as initial empirical constraints on the model. The validity of the model is then scrutinised in a further empirical study, where, using the model's predictions, we examine subjects' performance, including learning in this domain, in a principled, detailed way.

Skill Acquisition in Scheduling Tasks

From a psychological point of view planning can be considered a problem-solving activity in which, in a prospective manner, an ordered sequence of executable actions has to be constructed. In a more formal sense, this means specifying a sequence of operators with well-defined conditions and consequences that transform a given state into a goal state. For interesting problems, the entire problem space cannot be searched, and heuristics must be used to guide the search.

Scheduling problems are a specific, important subset of planning. Here the task of the problem solver is to find an optimal schedule given the constraints (e.g. minimal processing time). Factory scheduling (so-called job-shop scheduling) is a further subset of scheduling tasks: finding the optimal ordering of activities on machines in a factory.

Job-shop scheduling has direct, practical importance. Over the last two decades algorithms have been derived that produce optimal (or near optimal) solutions for scheduling tasks using operations research techniques (e.g. Graves, 1981) as well as AI techniques (e.g. Fox, Sadeh, & Baykan, 1989). Other AI-based approaches have used the general learning mechanisms in PRODIGY (Minton, 1990) or Soar (Prietula & Carley, 1994). These systems rely on the assumption that general methods for efficient problem solving can be discovered by applying a domain-independent learning mechanism. In psychology, on the other hand, little is known about how scheduling is performed as a problem solving activity and about the acquisition of scheduling skills (for a counterexample and earlier call to arms, see Sanderson, 1989).

The Job-Shop Scheduling Task

The task -- for the subjects as well as for the computational model -- is to schedule five actions (jobs) optimally, as the scheduler or dispatcher of a small factory. Figure task illustrates the task we used. Thus, one out of the 5! possible sequences of actions has to be found. Jobs had to be scheduled on two machines (A and B). Each job had to be run in a fixed order, first on A and then on B, and required a different processing time on each machine. Sets of five jobs with randomly created processing times were given to the subjects on a computer display. Subjects tried to find the order of jobs that produced the minimal total processing time, determining which of the five jobs should be run first, which second, and so on.

[insert the figure here]

Figure task. The task solved by subjects and the model.

For this kind of scheduling task an algorithm that finds the optimal solution is available (Johnson, 1954). The general principle requires comparing the processing times of the jobs and finding the job requiring the shortest time on either of the two machines. If this time is on machine A, then the job has to be run first; if it is on B, then last. This principle is applied until all of the jobs are scheduled. Suboptimal sequences result if only part of the general principle is used, for example, if only the demands on machine A are used for ordering the jobs. This task of modest complexity was selected because (a) it is simple enough to assess the value of each trial's solution by comparing it to the actual optimal solution, (b) it is hard enough to be a genuine problem for subjects, who have to solve it deliberately, and (c) solving it without errors requires discovering and applying a general principle.
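Johnson's rule is compact enough to state in code. The sketch below is our formulation of the published principle, not code from the study; the job names and times are made up.

def johnson_schedule(jobs):
    """Optimal two-machine flow-shop order (Johnson, 1954).

    jobs: dict mapping job name -> (time on A, time on B).
    Returns the job names in optimal processing order."""
    remaining = dict(jobs)
    front, back = [], []
    while remaining:
        # Find the job with the smallest time on either machine.
        job = min(remaining, key=lambda j: min(remaining[j]))
        time_a, time_b = remaining.pop(job)
        if time_a <= time_b:
            front.append(job)    # shortest time on A: run as early as possible
        else:
            back.insert(0, job)  # shortest time on B: run as late as possible
    return front + back

# Five jobs with (machine A, machine B) processing times:
jobs = {"J1": (3, 6), "J2": (5, 2), "J3": (1, 2), "J4": (6, 6), "J5": (2, 4)}
print(johnson_schedule(jobs))    # e.g. ['J3', 'J5', 'J1', 'J4', 'J2']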

What is Learned in this Task?

Learning to solve scheduling tasks, like learning in general, requires the acquisition as well as the storage of rules in memory. In this task, acquisition means discovering the general rule, or at least inferring useful scheduling heuristics, while performing the task. If no rule for scheduling jobs is available and the problem solver proceeds by blind search, on average no great improvement should occur. Only if the subject generates internal hypotheses about the ordering rules, and if feedback about the correctness of these assumptions is available, will the subject be able to discover efficient scheduling rules. And only if a discovered rule is stored in memory will the improvement carry over to later trials. As in impasse-driven learning theories (VanLehn, 1988), it is assumed that rule acquisition takes place particularly when subjects face a situation in which their background knowledge is not sufficient to solve the problem immediately.

Of course, as in other domains, learning in scheduling tasks depends on the amount of practice, and it is highly situated. An essential situational factor, which facilitates or inhibits the acquisition of rules and thus the progress in learning, is the interaction of the problem solver with the environment. In this task, the interaction provides feedback about the quality of the subject's solution and therefore about the efficiency of the applied rule.

In previous experiments investigating this task (Krems & Nerb, 1992), 38 subjects each had to generate 100 schedules. Although the main focus of this work was to investigate the effect of different kinds of feedback on learning, the data also describe some general regularities. There are several important empirical results, shown in Table constraints, that should be used to constrain the design of process models of scheduling skill acquisition.

Table constraints. Important regularities of behaviour on this task taken from Krems and Nerb (1992).

(a) Total processing time: The task takes 22.3 s on average for a novice (range: 16 s to 26 s).

(b) General speed-up effect: On average, the processing time for scheduling a set of jobs decreased 22% from the first ten trials to the last ten.

(c) Improvement of solutions: The difference between the subject's answers and the optimum answers decreased more than 50% over the 100 trials.

(d) Suboptimal behaviour: It should be emphasised that even after 100 trials the solutions are not perfect.

(e) Algorithm not learned: None of the subjects detected the optimal scheduling rules (i.e. nobody could give a verbal description of the underlying principle when asked after the trials), but most subjects came up with some simple ideas and heuristics, that is, they had some partial knowledge of the optimal algorithm.

Sched-Soar

Sched-Soar is a computational model of skill acquisition in scheduling tasks. The architectural component -- Soar -- is strictly separated from the task-specific knowledge, in this case the scheduling knowledge that was included. In addition, the model includes a knowledge-based learning mechanism that predicts human learning on this task, and that may explain how apparently rule-based behaviour can arise from apparently more chaotic behaviour.

In addition to the empirical constraints, we include the following general constraints that are consistent with or based on the Soar architecture:

(a) The task is described and represented in terms of problem spaces, goals, and operators, as a Problem Space Computational Model (Newell, 1990). All knowledge is implemented as a set of productions. Soar's bottom-up chunking mechanism is used for learning, which means that chunks are built only over terminal subgoals. This has been proposed as a basic characteristic of human learning (Newell, 1990, p. 317).

(b) An initial set of knowledge about scheduling tasks is provided as part of long-term knowledge (e.g., to optimise a sequence of actions it is first necessary to analyse the resource demands of every action). Basic algebraic knowledge is also included, such as ordinal relations between numbers. Together, this amounts to 401 productions implementing eight problem spaces.

(c) The model is feedback-driven. If the available internal knowledge is not sufficient to choose the next action to schedule, the supervisor is asked for advice. These are situations in which a human problem solver would have to do the same, or guess. This does not mean that there are no impasses in the model, for there are. It means that when the impasse is the inability to select between tied operators proposing jobs to schedule, there is knowledge suggesting asking for outside advice to resolve the impasse.

Processing Steps

Sched-Soar begins with an initial state containing the five jobs to be scheduled and a goal state of having them well scheduled, but without knowledge of the actual minimum total processing time. The minimal scheduling knowledge that Sched-Soar starts with leads to the following main processing steps, which are applied at every scheduling step (a code sketch of the complete loop follows the list):

(1) Sched-Soar analyses the situation and tries to find a job to attempt to schedule. The analysis includes computing the rank order of the remaining jobs by the time each takes on machine A, proposing operators to schedule each job next, and noting how many jobs have been scheduled so far. Previous knowledge may allow the model to proceed directly to step 3, where a learned rule indicates that a particular job should be scheduled next based on how many jobs have been scheduled and the amount of time the job takes.

(2a) If no decision can be made, despite examination of all available internal knowledge, Sched-Soar requests advice from the environment. The advice specifies the job that is the optimal choice to schedule next in the current set.

(2b) After getting advice about which job to schedule next, Sched-Soar reflects on why the advice applies to the current situation. In doing so, Sched-Soar uses its scheduling and basic arithmetic knowledge to work out what distinguishes the proposed job from all the others, using features such as the relations between jobs, the resources required by single actions, and the position of an action in a sequence.

(2c) Based upon this analysis, Sched-Soar explicitly memorises those aspects of the situation that seem to be responsible for the supervisor's advice. In the current version, Sched-Soar focuses on only one qualitative aspect of the suggested job, namely the rank of the job's processing time on the first machine. Restricting what is memorised to just one feature of the situation is clearly a heuristic that tries to take into account the limited memory and memorising capability of humans.
We call a chunk built in this way an episodic chunk. Episodic chunks implement search-control knowledge, specifying what has to be done and when. An example is: if two jobs are already scheduled, and three operators suggesting jobs to be scheduled next are proposed, and job 1 has the shortest processing time on machine A compared with the other jobs, then give a high preference value to the operator that schedules job 1. This kind of memorising is goal-driven, done by an operator, and would not arise from the ordinary chunking procedure without this deliberation. If a similar situation is encountered in subsequent trials, Sched-Soar will bring its memorised knowledge to bear. Of course, because the memorised information is heuristic, positive as well as negative transfer can result. Consequently, because only explicit, specific rules are created, general rule-based behaviour appears to arise slowly and erratically.

(3) The job is assigned to the machines and bookkeeping of the task and experiment is performed.
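These steps can be summarised in a brief Python sketch. It is our abstraction with hypothetical names, not the model's Soar code; in particular, the re-evaluation that follows negative transfer (when a memorised cue matches but suggests the wrong job) is omitted. An episodic chunk here keys on exactly one feature pair: how many jobs are already scheduled and the rank of the suggested job's processing time on machine A.

# Abstract sketch of Sched-Soar's processing loop (steps 1-3 above).
# episodic_chunks holds cues of the form (number of jobs scheduled so
# far, rank of the job's time on machine A) -- a deliberately partial cue.
episodic_chunks = set()

def rank_on_a(job, unscheduled, times):
    """Rank (0 = shortest) of a job's machine-A time among unscheduled jobs."""
    return sorted(times[j][0] for j in unscheduled).index(times[job][0])

def run_trial(jobs, times, ask_advice):
    schedule, unscheduled = [], list(jobs)
    while unscheduled:
        position = len(schedule)                       # (1) analyse the situation
        matches = [j for j in unscheduled
                   if (position, rank_on_a(j, unscheduled, times)) in episodic_chunks]
        if len(matches) == 1:
            job = matches[0]                           # (3) a learned rule decides
        else:
            job = ask_advice(schedule, unscheduled)    # (2a) ask the supervisor
            # (2b, 2c) reflect and memorise one feature of the advised job
            episodic_chunks.add((position, rank_on_a(job, unscheduled, times)))
        schedule.append(job)                           # (3) assign and bookkeep
        unscheduled.remove(job)
    return schedule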

Sched-Soar's Behaviour

The model's behaviour can be examined like a subject's, with each individual run on a set of problems treated as a simulated subject. Figure 3 shows Sched-Soar's solution time on 4 series of 16 randomly created tasks. Neither a power function (R2 = 0.55) nor a simple linear function (R2 = 0.53) proves a good fit to these individual data. However, when the series are averaged, Figure 4 shows that they fit a simple power function well (T = 274.0 * N^-0.3, with R2 = 0.95).

Figure 3: Individual processing times of the Soar model for four sets (Soar I-IV) of simulated data and a power law fit.

Figure 4: Averaged processing time for four sets of simulated data and a power law fit.
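Such a fit can be reproduced with a standard log-log regression. The sketch below is ours, not part of the model, and the data are synthetic stand-ins for the model's cycle counts; it estimates a and b in T = a * N^b.

import numpy as np

def fit_power_law(times):
    """Fit T = a * N**b by linear regression in log-log space."""
    n = np.arange(1, len(times) + 1)
    b, log_a = np.polyfit(np.log(n), np.log(times), 1)
    a = np.exp(log_a)
    # R^2 of the fit, computed in log space
    resid = np.log(times) - np.log(a * n ** b)
    r2 = 1 - resid.var() / np.log(times).var()
    return a, b, r2

# Illustrative data only (not the model's actual cycle counts):
times = 274.0 * np.arange(1, 17) ** -0.3 * np.random.uniform(0.9, 1.1, 16)
a, b, r2 = fit_power_law(times)
print(f"T = {a:.1f} * N^{b:.2f}, R2 = {r2:.2f}")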


A closer look at Sched-Soar's problem solving process shows that the variance in the individual trials comes from two main sources: the negative transfer of chunked knowledge and the number of requests for advice. Negative transfer results when an episodic chunk, built while solving a previous task, suggests an action in a situation that appears similar to the earlier one but requires a different action. When this occurs, the situation has to be evaluated again to find the proper schedule element, and, if there is still no suitable knowledge, the model has to ask for advice. This explains why, in the model's performance, additional requests for advice are often preceded by one or more instances of negative transfer. Both negative transfer and asking for advice directly lead to more deliberation effort as measured in model cycles.

An assumption in Soar is that the learning rate should be constant. In Sched-Soar, the learning rate (in chunks per trial) indeed proved constant over all 4 x 16 trials (Chunks(t) = 15.48 * t + 417.8, with R2 = 0.98). Note, however, that because performance improves only as a negatively accelerated power function, the learned chunks become less effective over time.

Comparison of Sched-Soar with Data

The model was evaluated in two ways, by investigating how many of the preexisting, empirical constraints were met, and by comparing the model results to new empirical data in a further study. We take these up in turn.

Comparison with Previous Regularities

The model shows that the empirical constraints in Table constraints are met in general. (a) Solving the task requires 151 model cycles, averaged over all trials and runs. The Soar architecture specifies the rate of model cycles to within half an order of magnitude, with a centre point at ten per second; this constrains the time per cycle only to lie between 30 ms and 300 ms (Newell, 1990). At the mean expected rate of 100 ms per cycle the model would perform slightly faster than the subjects; matching their times requires 147 ms/cycle, which is well within the theoretical bounds.

(b) The speed-up of the model over 16 trials is greater than the subjects' -- 56% (from 270 cycles in the first trial to 118 cycles in trial 16) compared with 22% by the subjects from trial one to ten. The speed-up of the model from trial one to ten (to 132 cycles) was 51% and hence also exaggerates the empirical finding. However, this might be an effect of the different modes of receiving feedback (in the empirical study feedback was given after a complete solution, whereas Sched-Soar is advised immediately after every single scheduling decision). (c) The improvement in correctness cannot be judged yet, because Sched-Soar was programmed to use advice so that its solutions are always correct. (d) Sched-Soar's behaviour is still suboptimal after 16 trials (and negative transfer might still occur in later trials). (e) The model did not discover or implement the general algorithm.

Sched-Soar accounted for and explained the general regularities from the previous study as far as they allowed. To know where to improve the model further, more data are necessary that match the model's interactions and capabilities.

Experiment: Testing the Model's Predictions

To make the model and subject data more commensurable, we conducted an additional empirical study in which we assessed participants' behaviour in solving the task under conditions more similar to those realised in the simulation studies. For this study, the model results can be considered theoretical predictions about subjects' behaviour in learning to solve the task.

Method

The experiment was carried out with 14 subjects, all psychology students receiving course credit for participation. Data were collected in single sessions with the help of a computer-implemented version of the scheduling task. As in the simulation, participants could get advice immediately after every single scheduling decision. Subjects were instructed to separate their decisions based on knowledge from those based on guesses: they were requested to ask for advice when they did not know what to do. Each subject solved a total of 18 different scheduling problems. At the end of the experiment, subjects were debriefed and asked whether they had detected regularities within the task.

Results and Discussion

The subjects' learning rate is shown in Figure 5. We found that a power function (T = 109.6 * N^-0.38) accounts best for the averaged data (R2 = 0.82, compared with 0.71 and 0.73 for linear and exponential fits). The average processing times for trials 1 to 18 range between 99.4 and 36.3 s.

As has been noted (Kieras, Wood, & Meyer, 1997), like many cognitive models (e.g. Peck & John, 1992), Sched-Soar performs the task more efficiently than subjects do, predicting between 270 and 116 model cycles on these tasks. Matching the subjects' times therefore requires assuming 369 or 313 ms/model cycle, which is slightly above the region defined by Newell (1990). The learning rate (power law exponent) of the subjects is only marginally higher than that of the model (-0.38 vs. -0.30), and well within the same range, suggesting that the task was equally complex for subjects and model (see Newell & Rosenbloom, 1981). Thus, besides coming temporally close, Sched-Soar also accounts for qualitative aspects of the subjects' learning.
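The ms-per-cycle figures here and in the previous section follow from simple division of the observed times by the predicted cycle counts; a small check using the values reported in the text (differences of a millisecond from the reported figures are rounding in the inputs):

# Mapping observed solution times onto predicted model cycles.
print(22.3 / 151 * 1000)    # previous study: ~148 ms per model cycle
print(99.4 / 270 * 1000)    # trial 1 of the new study: ~368 ms per cycle
print(36.3 / 116 * 1000)    # trial 18 of the new study: ~313 ms per cycle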

If these time constraints are taken seriously, future extensions to Sched-Soar should include more of the tasks that subjects must perform and the model currently does not, such as reading the jobs off the screen or typing in the schedule. These will take time to perform on the very first trials, but they should offer opportunities for learning, leading both to an increase in average processing time and to a somewhat higher learning rate, because these are easier tasks to learn.

Interestingly, none of the subjects discovered the optimal algorithm for solving the task, as their performance and the debriefing session revealed. Since the subjects obviously improved over time, an implicit learning mechanism is suggested.

We also found a correlation of r = 0.46 between processing time and the subjects' requests for advice, due to lack of knowledge or wrong decisions. This corresponds with how the model exhibits negative transfer of chunked knowledge. The correlation must be examined at a finer level before the two can be treated as equivalent.

Figure 5: Processing time from trial 1 to 18 for two sample individuals, the average solution time of all subjects, and a power law fit to the average.

Conclusions

Sched-Soar shows in detail how rule-like behaviour can arise out of apparently noisy behaviour, giving rise to slowly improving performance due to learning while problem solving. The model does not know enough to be, or to become, perfect; its representation and problem solving knowledge are too weak for that. It does know enough, however, to get better. People exhibit this behaviour in many circumstances. The model suggests that their behaviour can be optimal given the knowledge that they have; further improvements will have to come with additional knowledge.

How the Power Law Arises

A major claim of this analysis is that the power law of practice appears in the kind of scheduling task used in these studies, but only when the data are averaged over subjects or trials. We saw both the model and the subjects improve in a non-linear, non-continuous way.

Sched-Soar indicates fairly clearly that the variance in solution time is not due to noise in the measurement process or variance in the processing rate of the underlying cognitive architecture (which might have been proposed for simpler, more perceptual tasks). Rather, the variance in solution time is caused by variance in how much learned knowledge transfers to new situations. This regularity may be further overshadowed by additional deliberation effort. For example, a subject can simply decide to ask for advice, or start a more elaborate lookahead search to an arbitrary depth constrained only by their working memory and, of course, their motivation. Stripping away this time, or replacing it with a constant factor, should also yield a power law function, but again only when averaged. A more fine-grained analysis would be required (and is possible) to look at the firing of each episodic chunk that led to negative transfer, comparing this with each subject's behaviour (Agre & Shrager, 1990; Ritter & Larkin, 1994).
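This claim can be illustrated with a small simulation (ours, purely illustrative, with made-up transfer values): each simulated subject improves through discrete learning events whose transfer varies from trial to trial, so the individual curves are jagged while their average follows the smooth power law.

import numpy as np

rng = np.random.default_rng(1)
trials = np.arange(1, 17)

def individual_curve():
    """One simulated subject: a power-law trend perturbed by how much
    previously learned knowledge happens to transfer on each trial
    (lumpy, sometimes negative, rarely an average amount)."""
    transfer = rng.choice([0.6, 0.9, 1.0, 1.3], size=trials.size)
    return 274.0 * trials ** -0.3 * transfer

curves = np.array([individual_curve() for _ in range(14)])
average = curves.mean(axis=0)
# Individual curves are jagged; the average is close to the smooth trend:
print(np.round(curves[0]))   # one noisy simulated subject
print(np.round(average))     # smooth, near 274 * N^-0.3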

This is consistent with studies showing that the power law of practice applies across strategies (Delaney, Reder, Staszewski, & Ritter, 1998), but it goes further. Sched-Soar suggests that the task improvements arise out of individual learning events; learning is thus discrete in some fundamental way. Although there may be other mechanisms that tune existing productions, they are not applicable at these time scales.

How Rule-Like Behaviour Arises

Sched-Soar is one of the first problem solving models to use episodic memory, learned through chunking, to learn a task in a cognitively plausible manner. It shows how the acquisition of rule-like behaviour can be achieved in the Soar architecture through the storage of specific, context-dependent rules based on simple heuristics. This demonstrates that Soar's chunking mechanism can be a good model of rule storage based on gradual rule acquisition, allaying some worries in this area (VanLehn, 1991, p. 38).

A larger question is how to view this mechanism in the context of bigger tasks, for example, as a grammar learning mechanism. This model shows how rule-like behaviour can arise gradually from a simple chunking mechanism. The rule-like behaviour is not defined explicitly as a rule or definition, but implemented as an expanding set of rules implementing, and arising out of, an information processing algorithm. The improvements come from recognising regularities (i.e. the best schedule) based on simple heuristic knowledge and attention to a partial set of the necessary features while trying to minimise resources. In this way, it is consistent with previous work on learning grammars through chunking (Servan-Schreiber & Anderson, 1992).

Open Questions

Further empirical work will be necessary to answer some of the questions posed by the model. For example, when will a strategy change take place, and will its results be of a local or a global nature? As the model is developed further, it should prove useful for explaining other aspects of scheduling behaviour (e.g., the effect of further kinds of feedback on rule acquisition) and provide a possible new approach to constraint-based planning.

References

Aasman, J., & Michon, J. A. (1992). Multitasking in driving. In J. A. Michon & A. Akyürek (Eds.), Soar: A cognitive architecture in perspective. Dordrecht, The Netherlands: Kluwer.

Agre, P. E., & Shrager, J. (1990). Routine evolution as the microgenetic basis of skill acquisition. In Proceedings of the 12th Annual Conference of the Cognitive Science Society. 694-701. Hillsdale, NJ: Lawrence Erlbaum.

Altmann, E. M., & John, B. E. (in press). Episodic indexing: A model of memory for attention events. Cognitive Science.

Bass, E. J., Baxter, G. D., & Ritter, F. E. (1995). Using cognitive models to control simulations of complex systems. AISB Quarterly, 93, 18-25.

Baxter, G. D., & Ritter, F. E. (1996). The Soar FAQ, http://www.psychology.nottingham.ac.uk/users/ritter/soar-faq.html (1.0). Nottingham: Psychology Department, U. of Nottingham.

Chong, R. S., & Laird, J. E. (1997). Identifying dual-task executive process knowledge using EPIC-Soar. In Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society. 107-112. Mahwah, NJ: Lawrence Erlbaum Associates.

Delaney, P. F., Reder, L. M., Staszewski, J. J., & Ritter, F. E. (1998). The strategy specific nature of improvement: The power law applies by strategy within task. Psychological Science, 9(1), 1-8.

Fox, M., Sadeh, N., & Baykan, C. (1989). Constrained Heuristic Search. In Proceedings of IJCAI'89. 20-25.

Graves, S. C. (1981). A review of production scheduling. Operations Research, 29, 646-675.

Howes, A., & Young, R. M. (1996). Learning consistent, interactive, and meaningful task-action mappings: A computational model. Cognitive Science, 20(3), 301-356.

Howes, A., & Young, R. M. (1997). The role of cognitive architecture in modeling the user: Soar's learning mechanism. Human-Computer Interaction, 12, 311-343.

Huffman, S. B., & Laird, J. E. (1995). Flexibly instructable agents. J. of AI Research, 3, 271-324.

John, B. E., Vera, A. H., & Newell, A. (1994). Towards real-time GOMS: A model of expert behavior in a highly interactive task. Behavior and Information Technology, 13, 255-267.

Johnson, K. A., Johnson, T. R., Smith, J. W. J., DeJongh, M., Fischer, O., Amra, N. K., & Bayazitoglu, A. (1991). RedSoar: A system for red blood cell antibody identification. In Fifteenth Annual Symposium on Computer Applications in Medical Care. 664-668. Washington: McGraw Hill.

Johnson, S. M. (1954). Optimal two- and three-stage production schedules with setup times included. Naval Research Logistics Quarterly, 1, 61-68.

Kieras, D. E., Wood, S. D., & Meyer, D. E. (1997). Predictive engineering models based on the EPIC architecture for a multimodal high-performance human-computer interaction task. Transactions on Computer-Human Interaction, 4(3), 230-275.

Krems, J., & Johnson, T. (1995). Integration of anomalous data in multicausal explanations. In J.D. Moore & J.F. Lehman (Eds), Proceedings of the 1995 Annual Conference of the Cognitive Science Society (pp. 277-282). Hillsdale: Erlbaum.

Krems, J., & Nerb, J. (1992). Kompetenzerwerb beim Lösen von Planungsproblemen: Experimentelle Befunde und ein SOAR-Modell (Skill acquisition in solving scheduling problems: Experimental results and a Soar model) (FORWISS-Report FR-1992-001). München: FORWISS.

Larkin, J. H., & Simon, H. A. (1981). Learning through growth of skill in mental modeling. In H. A. Simon (Ed.), Models of thought II. 134-144. New Haven, CT: Yale University Press.

Lehman, J. F., Laird, J. E., & Rosenbloom, P. S. (1996). A gentle introduction to Soar, an architecture for human cognition. In S. Sternberg & D. Scarborough (Eds.), Invitation to cognitive science, vol. 4. Cambridge, MA: MIT Press.

Miller, C. S., & Laird, J. E. (1996). Accounting for graded performance within a discrete search framework. Cognitive Science, 20, 499-537.

Minton, S. (1990). Quantitative results concerning the utility of explanation-based learning. Artificial Intelligence, 42, 363-391.

Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.

Newell, A., & Rosenbloom, P. (1981). Mechanisms of skill acquisition and the law of practice. In J. R. Anderson (Ed.), Cognitive skills and their acquisition. 1-56. Hillsdale, NJ: Erlbaum.

Norman, D. A. (1991). Approaches to the study of intelligence. Artificial Intelligence 47, 327-346.

Peck, V. A., & John, B. E. (1992). Browser-Soar: A computational model of a highly interactive task. In Proceedings of the CHI '92 Conference on Human Factors in Computing Systems. 165-172. New York, NY: ACM.

Prietula, M. J., & Carley, K. M. (1994). Computational organization theory: Autonomous agents and emergent behavior. J. of Organizational Computing, 41(1), 41-83.

Rieman, J., Lewis, C., Young, R. M., & Polson, P. G. (1994). "Why is a raven like a writing desk?" Lessons in interface consistency and analogical reasoning from two cognitive architectures. In Proceedings of the CHI '94 Conference on Human Factors in Computing Systems. 438-444. New York, NY: ACM.

Rieman, J., Young, R. M., & Howes, A. (1996). A dual-space model of iteratively deepening exploratory learning. International Journal of Human-Computer Studies, 743-775.

Ritter, F. E., & Bibby, P. A. (1997). Modelling learning as it happens in a diagrammatic reasoning task (Tech. Report No. 45). ESRC CREDIT, Dept. of Psychology, U. of Nottingham.

Ritter, F. E., Jones, R. M., & Baxter, G. D. (in press). Reusable models and graphical interfaces: Realising the potential of a unified theory of cognition. In U. Schmid, J. Krems, & F. Wysotzki (Eds.), Mind modeling - A cognitive science approach to reasoning, learning and discovery. Lengerich: Pabst Scientific Publishing.

Ritter, F. E., & Larkin, J. H. (1994). Using process models to summarize sequences of human actions. Human-Computer Interaction, 9(3), 345-383.

Ritter, F. E., & Young, R. M. (1996). The Psychological Soar Tutorial, http://www.psychology.nottingham.ac.uk/staff/ritter/pst-ftp.html (12.). Nottingham: Psychology Department, U. of Nottingham.

Rosenbloom, P. S., Laird, J. E., & Newell, A. (1992). The Soar papers: Research on integrated intelligence. Cambridge, MA: MIT Press.

Sanderson, P. M. (1989). The human planning and scheduling role in advanced manufacturing systems: An emerging human factors domain. Human Factors, 31(6), 635-666.

VanLehn, K. (1988). Toward a theory of impasse-driven learning. In H. Mandl & A. Lesgold (Eds.), Learning issues for intelligent tutoring systems. 19-41. New York, NY: Springer.

VanLehn, K. (1991). Rule acquisition events in the discovery of problem-solving strategies. Cognitive Science, 15(1), 1-47.