We will use a deliberately very simple model to teach the basic concepts in Soar. Suppose you have a robot that can carry out just two actions, EAT and DRINK. Imagine that initially it is both hungry and thirsty, and that its goal is to be not hungry. How can it use its actions to achieve that goal?
The problem is presented to Soar in terms of states. This is done straightforwardly:
This analysis gets implemented in Soar as a problem space model made up of production rules. We will use the model in the exercises incorporated in the tutorial. Answers for these exercises are also in the tutorial, and should only be viewed once you have completed the relevant exercise.
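To make the problem-space analysis concrete before turning to Soar itself, here is a minimal Python sketch (not Soar code). It assumes that EAT removes hunger and DRINK removes thirst; those effects, and all the names used, are assumptions made for illustration rather than part of the tutorial's model.

```python
from collections import deque

# A state is a pair (hungry, thirsty) of booleans.
INITIAL = (True, True)

# Assumed operator effects: EAT removes hunger, DRINK removes thirst.
OPERATORS = {
    "eat": lambda s: (False, s[1]),
    "drink": lambda s: (s[0], False),
}


def goal_reached(state):
    """The goal is to be not hungry; thirst does not matter for the goal."""
    hungry, _thirsty = state
    return not hungry


def search(initial):
    """Breadth-first search for a shortest operator sequence reaching the goal."""
    frontier = deque([(initial, [])])
    seen = {initial}
    while frontier:
        state, path = frontier.popleft()
        if goal_reached(state):
            return path
        for name, apply_op in OPERATORS.items():
            nxt = apply_op(state)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [name]))
    return None  # every reachable state was tried without success


plan = search(INITIAL)
print(plan)
```

The search here is deliberately unguided. It stands in for the idea of moving through a space of states by applying operators, not for the way Soar actually selects operators.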
For a discussion of Soar as a candidate Unified Theory of Cognition (UTC), see Allen Newell's Unified Theories of Cognition (1990). It is important as background, but we will not be explicitly dealing with it in the tutorial, except to note that the book includes arguments and discussion of the virtues of unification:
The intellectual origins of this approach can be found in (among many other sources) the work on Production System architectures from the 1970s onwards -- Newell's 1980 paper on The problem space as a fundamental category of cognition, and his 1982 paper on The knowledge level. Full references for these and other relevant papers are included in the References section of this tutorial.
A basic overview of the principal components of the Soar architecture:
In Soar, all behaviour is seen as occurring in a problem space, made up of Goals, Problem Spaces (P or PS), States (S) and Operators (O or Op). In earlier versions of Soar these were all explicit choices.
In the current NNPSCM Soar (Soar7), the Goal and Problem Space are treated as part of the State. Because of this, the State -- especially when thought of as containing the Goal and Problem Space -- is sometimes referred to as a "Context". In this tutorial, we will continue to use the term "Problem Space" sometimes in a deliberately loose way, to mean the Context.
There can be several problem spaces (i.e., contexts) active at any one time. Each may lack some required knowledge, and be able to provide knowledge to other contexts. The main idea behind splitting knowledge into problem spaces (contexts) is that it reduces the search for information. The idea has also been used as a software design technique because it is a successful way to partition knowledge.
Fluent behavior is repeated application of operators.
No knowledge => No behaviour.
In order to act in a domain, Soar must have knowledge of that domain (either given to it or learned).
It's useful to divide domain knowledge into two categories:
Given just the basic problem space knowledge, Soar can proceed to search using it. But the search will be "unintelligent" (e.g., random or unguided depth first), since by definition it does not have the extra knowledge needed to do intelligent search.
Important basic problem space knowledge centres round the operators: when an operator is applicable, how to apply it, and how to tell when it is done.
Soar comes with a set of default rules that, among other things, provide a minimal, domain-knowledge-free response to each possible type of impasse. Without them, Soar essentially crashes if it hits an impasse for which it has no domain knowledge. With the default rules, it at least survives, either through some abstract planning or default searching for knowledge, or simply waiting with a WAIT operator.
Those minimal default rules can also be seen as playing an important theoretical role. Without them, Soar is really a rather weird kind of production system interpreter. With them, it becomes a problem-solving architecture.
The default rules unfortunately contain a mess of other things: some simple behaviour to indulge in when nothing else is specified; rules to support certain coding conventions; a technique for handling tie impasses; and so on. In this tutorial we will make little use of the default rules, and will not examine them in any detail.
Now try *Exercise 1*, which will show you what Soar does with and without the default rules.
Although we think of Soar as operating conceptually at the problem space level, its behavior is realised by encoding knowledge at the more concrete symbol level. In this tutorial we will mainly concentrate on how to realise the problem space level at the symbol, or programming, level.
At the symbol level, each context has two context slots, one for the state and one for the operator. Note that the problem space associated with the state is represented as an attribute of that state, rather than having an explicit context slot of its own (as was the case in earlier versions of Soar).
The display on the right is in "watch 1 trace format", a term we will return to later on.
Remember, in Soar, as a problem space architecture, we said that within the problem space there is a current state that gets modified by the application of operators. It is the context slots that tell us at any one time which is the current state for each context (and so which problem space we are in, since it is an attribute of the state), and which operator we are applying.
Each major cycle in Soar (called a decision cycle, see later) ends with some kind of change to the context stack. If the knowledge available to Soar (i.e., the productions) specifies a unique slot filler (such as the next operator for some context) then that change is made. Otherwise, an impasse arises because the immediately applicable knowledge is insufficient to specify the change. For example, in the current state there may be no operators to apply, or we may not be able to choose between several operators.
You may have noticed this context slot filling behaviour during Exercise 1. Alongside each slot (e.g., S1) will be the name of the assigned state or operator.
Knowledge in Soar is encoded in production rules. A rule has conditions on its Left Hand Side (LHS), and actions on the Right Hand Side (RHS):
C --> A.
Two of Soar's memories are of relevance here: the production memory (PM) or long-term memory, permanent knowledge in the form of production rules; and the working memory (WM), temporary information about the situation being dealt with, as a collection of elements (WMEs).
The LHSs of productions test WM for particular patterns of WMEs. Unlike most other production systems, Soar has no syntactic conflict resolution to decide on a single rule to fire at each cycle. Instead, all productions whose conditions are satisfied fire in parallel.
For example, the following rule proposes an operator 'eat' if we are hungry and desire not to be hungry.
^desired ^name hungry-thirsty)
Translated into 'English', this rule would be:
If we are in the hungry-thirsty problem space AND we desire to be not hungry AND the current state says we are hungry then propose an operator to apply to the current state AND call this operator 'eat'
The rule firings have the effect of adding elements to WM (as we shall see), so that yet more productions may now have their conditions satisfied. So they fire next, again in parallel. This process of elaboration cycles continues until there are no more productions ready to fire. That is quiescence.
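The loop of parallel firings up to quiescence can be sketched in Python. This is a toy simulation under simplifying assumptions: working memory is a set of (identifier, attribute, value) triples, rules are plain functions that add elements directly (preferences are ignored here), and the identifiers `s1`, `p1`, `d1`, `o1` and the second rule are hypothetical.

```python
def propose_eat(wm):
    """If we are hungry and desire not to be hungry, propose an operator 'eat'."""
    needed = {
        ("s1", "problem-space", "p1"),
        ("p1", "name", "hungry-thirsty"),
        ("s1", "desired", "d1"),
        ("d1", "hungry", "no"),
        ("s1", "hungry", "yes"),
    }
    if needed <= wm:
        return {("s1", "operator", "o1"), ("o1", "name", "eat")}
    return set()


def note_eat_proposed(wm):
    """Hypothetical second rule: it can match only after propose_eat has fired."""
    if ("o1", "name", "eat") in wm:
        return {("s1", "note", "eat-proposed")}
    return set()


RULES = [propose_eat, note_eat_proposed]


def elaborate(wm):
    """Fire all matching rules in parallel, cycle after cycle, until quiescence."""
    wm = set(wm)
    cycles = 0
    while True:
        additions = set()
        for rule in RULES:
            additions |= rule(wm)  # every rule sees the same WM snapshot
        new = additions - wm
        if not new:  # nothing left to fire: quiescence
            return wm, cycles
        wm |= new
        cycles += 1


initial_wm = {
    ("s1", "problem-space", "p1"),
    ("p1", "name", "hungry-thirsty"),
    ("s1", "desired", "d1"),
    ("d1", "hungry", "no"),
    ("s1", "hungry", "yes"),
    ("s1", "thirsty", "yes"),
}
final_wm, cycles = elaborate(initial_wm)
print(cycles)
```

The second rule fires one cycle after the first because its condition is only satisfied by an element that the first firing added.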
Soar uses an attribute-value scheme to represent information in WM. Thus a conceptual object that stands for a large red block might be represented as:
All attributes are marked by a ^ (called an up-arrow or caret) and their values directly follow them. Note that the attribute-values do not have to be in any specific order, so we could have put ^isa block last in the list. Attributes can be added to working memory by the application of rules whose conditions have been satisfied.
Attributes commonly have just a single value, though they can have multiple values. However, there are both theoretical and practical reasons for avoiding multiple values whenever possible.
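One way to picture the attribute-value scheme outside Soar is as a set of (identifier, attribute, value) triples. The Python sketch below is only an analogy: the identifier `b1` and the attribute names `colour` and `size` are assumptions made for illustration.

```python
# Each working memory element is an (identifier, attribute, value) triple.
block = {
    ("b1", "isa", "block"),
    ("b1", "colour", "red"),
    ("b1", "size", "large"),
}

# Order does not matter: the same object written with "isa" last.
same_block = {
    ("b1", "size", "large"),
    ("b1", "colour", "red"),
    ("b1", "isa", "block"),
}


def values(wm, identifier, attribute):
    """Return every value of an attribute; usually one, but several are possible."""
    return {v for (i, a, v) in wm if i == identifier and a == attribute}


print(block == same_block)
print(values(block, "b1", "colour"))

# A multi-valued attribute is just a second triple with the same attribute.
two_tone = block | {("b1", "colour", "blue")}
print(sorted(values(two_tone, "b1", "colour")))
```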
The operator, which is a context slot, is represented as an attribute of a state symbol, though as we shall see (when we come to talk about the decision cycle) the slot gets its value only after quiescence is reached, that is, when there are no more productions to fire. The context slots (state and operator) always have at most one value.
Within a production rule, symbols like <x> are variables, and so can be used to link one object to another. For example, in the clauses below, <y> is used to reference one object from another:
In Soar, each of these variables will be bound to a unique, internally generated identifier (ID) such as x35, which represents the object, though we need not concern ourselves with this since we primarily reference objects through variables such as <s>. However, the IDs appear in the trace (which we will come to later), and can also be used to print objects for inspection.
The variables used for the attribute values do not necessarily have to take the initial letter of the attribute name (such as <o> for the ^operator value), but it will make your productions easier to understand, and hence you should take some care in naming them.
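How variables link one object to another can be illustrated with a small pattern matcher in Python. This is a sketch of the general idea only, not of Soar's matcher. The working memory contents and the identifiers `s1` and `o7` are made up for the example.

```python
def is_var(term):
    """Variables are written like <s>, as in Soar rules."""
    return isinstance(term, str) and term.startswith("<") and term.endswith(">")


def unify(pattern, wme, bindings):
    """Extend bindings so that pattern equals wme, or return None on failure."""
    bindings = dict(bindings)
    for p, w in zip(pattern, wme):
        if is_var(p):
            if bindings.setdefault(p, w) != w:
                return None  # variable already bound to something else
        elif p != w:
            return None  # constant does not match
    return bindings


def match(conditions, wm, bindings=None):
    """Return every consistent set of variable bindings for the conditions."""
    bindings = {} if bindings is None else bindings
    if not conditions:
        return [bindings]
    first, rest = conditions[0], conditions[1:]
    results = []
    for wme in wm:
        extended = unify(first, wme, bindings)
        if extended is not None:
            results.extend(match(rest, wm, extended))
    return results


wm = {
    ("s1", "operator", "o7"),
    ("o7", "name", "eat"),
    ("s1", "hungry", "yes"),
}

# <o> appears in both conditions, so it links the state to its operator.
conditions = [("<s>", "operator", "<o>"), ("<o>", "name", "eat")]
print(match(conditions, wm))
```

Because `<o>` must take the same value in both conditions, the only match binds `<s>` to `s1` and `<o>` to `o7`.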
Now try *Exercise 2* in order to familiarise yourself with the hungry-thirsty model.
We need to look a little more closely at the way rule firings change Working Memory (WM).
First, we look at changes to "ordinary" WM Elements (WMEs), that is, ones other than the context slots (state and operator).
With parallel rule firings, it is important that rules are not able to change WM directly, else there could be inconsistencies in WM and faulty knowledge could pre-empt correct knowledge.
So, in Soar, the RHSs of rules do not make changes directly to WM. Instead, they vote for changes to WM by producing preferences.
After each cycle, all the preferences are examined by a decision procedure that makes the actual changes to WM.
This means we have the idea of an elaboration cycle, which is a single round of parallel rule firings, followed by changes to the (non-context) WM:
The most important changes are those to one of the context slots (i.e. S or O). To gather all the available knowledge, Soar runs a sequence of elaboration cycles, firing rules and making changes to WM (which may trigger further rules) until there are no more rules to fire, i.e. until quiescence is reached. Only then does it look at the preferences for the context slots.
So we have the idea of a decision cycle (not to be confused with the decision procedure mentioned above), consisting of a number of elaboration cycles, followed by quiescence, followed by a change to some context slot (or by the creation of a subgoal if the preferences don't uniquely specify a change to the context slots):
Decisions for ordinary WM changes are made at the end of each elaboration cycle.
Decisions for context slot changes are made only after quiescence, which is at the end of each decision cycle.
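The relationship between elaboration cycles and the decision cycle can be sketched in Python. This is a simplified toy, not Soar's decision procedure: it tracks only operator preferences, it knows only two kinds of preference ("acceptable" and "reject"), and the three rules, including the control rule that rejects drink, are hypothetical.

```python
def decide_operator(preferences):
    """Choose the operator slot filler from (operator, kind) preferences."""
    acceptable = {op for op, kind in preferences if kind == "acceptable"}
    rejected = {op for op, kind in preferences if kind == "reject"}
    candidates = sorted(acceptable - rejected)
    if len(candidates) == 1:
        return ("select", candidates[0])
    if not candidates:
        return ("impasse", "state no-change")  # no operator to choose
    return ("impasse", "tie")  # several candidates, no way to choose


def decision_cycle(state, rules):
    """Run elaboration cycles to quiescence, then make one decision."""
    preferences = set()
    elaborations = 0
    while True:
        new = set()
        for rule in rules:
            new |= rule(state, preferences)  # rules only vote
        new -= preferences
        if not new:  # quiescence
            break
        preferences |= new
        elaborations += 1
    return decide_operator(preferences), elaborations


def propose_eat(state, preferences):
    return {("eat", "acceptable")} if state["hungry"] else set()


def propose_drink(state, preferences):
    return {("drink", "acceptable")} if state["thirsty"] else set()


def reject_drink(state, preferences):
    """Hypothetical control rule: while hungry, vote against drinking."""
    if ("drink", "acceptable") in preferences and state["hungry"]:
        return {("drink", "reject")}
    return set()


state = {"hungry": True, "thirsty": True}
print(decision_cycle(state, [propose_eat, propose_drink, reject_drink]))
print(decision_cycle(state, [propose_eat, propose_drink]))
```

With the control rule, two elaboration cycles are needed before quiescence and 'eat' is selected. Without it, the preferences do not single out one operator, so the decision yields an impasse instead of a selection.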
Soar runs by executing a sequence of decision cycles or elaboration cycles (depending on what command it was given) until it is told to stop or has run the number of cycles specified in the command.
We can see how elaboration cycles and decision cycles take place from the example below using run:
This is a run of the hungry-thirsty model, where we are watching rule firings only. Note that 'run 1' tells Soar to run for one elaboration cycle, which as we can see may complete a decision cycle if quiescence is reached. The decision cycle is denoted by the number alongside it, so we can see that the first decision cycle ('0') was to select a state (with no name associated with it) for the context slot. After this, there are elaboration cycles to vote for which operator should be selected for the empty operator context slot. The second decision cycle ('1') then selects the 'eat' operator to fill the operator context slot, which is about to be applied to the state so that it becomes modified. Such processing will continue until the goal is satisfied, or until every possible manipulation has been made without success, at which point Soar will halt having failed to satisfy the goal.
Now try *Exercise 3*, where you can examine the different detail levels within Soar using commands such as 'watch'.
We now go down yet another level and examine the syntax of the rules.
Consider a rule for the following statement:
"In the blocks world, if one block is on top of another block of a different colour, then propose repainting the lower block to be the same colour as the upper block".
When translated into a Soar model, it looks like this:
^desired ^name blocks-world)
Many of the basic rule features are shown in this rule:
Sometimes the syntax appears to get in the way by being verbose, but with practice rules do become easier to read.
There are shorthands for several common constructs. We can just give path names to access local structures. For example, using some of the shorthand available, the above rule begins:
When Soar encounters an impasse in context level-1, it sets up a subcontext (or "subgoal") at level-2, which has associated with it a new state, with its own problem space and operators. Note that the operators at level-2 could well depend upon the context at level-1.
The goal of the 2nd level context is to find knowledge sufficient to resolve the higher impasse, allowing processing to resume there. For example, we may not have been able to choose between two operators, so the level-2 sub-goal may simply try one operator to see if it solves the problem, and if not, tries the other operator.
The processing at level-2 might itself encounter an impasse, set up a subgoal at level-3, and so on. So in general we have a stack of such levels, each generated by an impasse in the level above. Each level is referred to as a context (or goal), and each context can have its own state, problem space and operators.
Notice that the architecture's problem solving approach is applied recursively at each level.
Soar automatically creates subgoals in order to resolve impasses. This is the only way that subgoals get created. (Note, therefore, that production rules never vote directly for states. The only context slot that rules vote for is the operator slot.)
What types of impasses are there? Roughly, for the operator slot, there can be:
The most common kinds (you're unlikely to meet any others) are:
Each kind of impasse, for its straightforward resolution, requires a particular kind of knowledge:
Interestingly, there are three possible reasons for an ONC:
It should be noted that there are other ways to deal with impasses, such as rejecting one of the context items that gives rise to it.
Now try *Exercise 6*, where you will examine an impasse.
Soar includes a simple, uniform learning mechanism, called chunking. Whenever a result is returned from an impasse, a new rule is learned connecting the relevant parts of the pre-impasse situation with the result. This means that next time a sufficiently similar situation occurs, the impasse is avoided.
The above diagram shows the WMEs that are created during an impasse, the resolution being the addition of the WME R, which solves the sub-goal. Multiple incoming arrows denote AND, so for WME 2 to be created, WMEs C and 3 must exist. Notice that R is dependent upon the existence of WMEs 3 and 4, so if we work our way back, we find that these elements require A, B and D only. This is why the chunk created will have these in its condition, and R in its action. You may think that C should also be included, since it is required in order for D to be added. However, D already exists at the time the impasse occurs, so it is not necessary to include C.
Chunks are formed when Soar returns a result to a higher context. The RHS is the result. The LHS are things that have been tested by the linked chain of rule firings leading to the result, the set of things that exist in the higher context ("pre-impasse") on which the result depends.
Identifiers are replaced by corresponding variables (and certain other changes are made).
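The backward trace that picks out a chunk's conditions can be sketched in Python, using the WME labels from the description above. Only two dependencies are stated in the text (R depends on 3 and 4, and 2 depends on C and 3). The supports given here for 3 and 4 are assumptions chosen to be consistent with the stated outcome, and the replacement of identifiers by variables is not modelled.

```python
# WMEs that already exist in the higher context when the impasse occurs.
PRE_IMPASSE = {"A", "B", "C", "D"}

# For each WME created in the subgoal, the WMEs tested by the rule that made it.
CREATED_FROM = {
    "3": {"A", "B"},  # assumed for illustration
    "4": {"D"},       # assumed for illustration
    "2": {"C", "3"},  # stated in the text
    "R": {"3", "4"},  # stated in the text
}


def chunk_conditions(result):
    """Trace back from a result to the pre-impasse WMEs it depends on."""
    conditions = set()
    pending = [result]
    visited = set()
    while pending:
        wme = pending.pop()
        if wme in visited:
            continue
        visited.add(wme)
        if wme in PRE_IMPASSE:
            # Stop here: how a pre-impasse WME came to exist is not traced,
            # which is why C is left out even though D once depended on it.
            conditions.add(wme)
        else:
            pending.extend(CREATED_FROM[wme])
    return conditions


print(sorted(chunk_conditions("R")))
```

WME 2 plays no part in the trace because R does not depend on it, so its dependence on C does not bring C into the chunk either.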
Just as each kind of impasse, for its straightforward resolution, requires a particular kind of knowledge, so also it gives rise to a characteristic kind of chunk:
For the three varieties of Operator No Change:
Problem solving and chunking mechanisms are thus tightly intertwined: chunking depends on the problem solving and most problem solving would not work without chunking.
Now we are at the point where, if we can model performance on a task in Soar, we expect to be able to model learning as well (cf. the position in Cognitive Science until just recently).
Even when no chunk is actually built, an internal chunk called a justification is formed. This can happen because learning is turned off, because learning is set to bottom-up (a mode that learns only from the bottom problem space), because the chunk is a duplicate of one that already exists, or for some other reason.
Justifications are needed in order to get persistence right; for example, to provide support (i-support or o-support, which are covered later) within Soar's truth maintenance system.
Now try *Exercise 7* where you will be able to see a chunk being created.
Models in Soar are rarely written from scratch. New models are typically built by copying old models and modifying them. There are also templates for the common actions in a problem space:
Now try *Exercise 8* where you will create a new problem space to implement an operator.
Information in WM is supported by a kind of Truth Maintenance System (TMS). When the conditions of a rule are satisfied, it fires, and produces various preferences. When the conditions become untrue, the rule (instance) is retracted, and the preferences may be retracted too. WMEs supported by those preferences may disappear from WM.
This issue of persistence is potentially complicated and confusing (and a source of subtle bugs). For now, we just take a simple view.
Most of the time, with cleanly written rules the issue of persistence takes care of itself, and you do not have to worry about it.
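The simple view of persistence can be sketched in Python. This toy assumes the usual distinction: results of operator application are o-supported and persist, while elaborations are i-supported and last only as long as their rule still matches. It recomputes the i-supported elements from scratch, which is a simplification and not how Soar's truth maintenance works internally. The rule `needs_food` and the identifier `s1` are hypothetical.

```python
def elaborate(persistent, i_rules):
    """Return the persistent WMEs plus everything the i-supported rules derive."""
    wm = set(persistent)
    while True:
        derived = set()
        for rule in i_rules:
            derived |= rule(wm)
        if derived <= wm:  # nothing new: quiescence
            return wm
        wm |= derived


def needs_food(wm):
    """Hypothetical elaboration rule: while hungry, note that food is needed."""
    if ("s1", "hungry", "yes") in wm:
        return {("s1", "needs", "food")}
    return set()


def apply_eat(persistent):
    """Operator application: the change is o-supported, so it persists."""
    return (persistent - {("s1", "hungry", "yes")}) | {("s1", "hungry", "no")}


persistent = {("s1", "hungry", "yes"), ("s1", "thirsty", "yes")}
before = elaborate(persistent, [needs_food])

persistent = apply_eat(persistent)
after = elaborate(persistent, [needs_food])

print(("s1", "needs", "food") in before)
print(("s1", "needs", "food") in after)
```

Once 'eat' has been applied, the condition of the elaboration rule is no longer true, so the element it supported disappears, while the o-supported change to `hungry` stays in place.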
Now try *Exercise 9*, which looks at o-support and i-support.
One of the strengths of Soar is that the correspondence between the model and psychology is pinned down, not free floating. Newell explains this in detail in his Unified Theories of Cognition (1990) book. One of the most straightforward ways of achieving this correspondence is in terms of timescales, for example:
Newell explains this topic in detail in his Unified Theories of Cognition (1990) book. A shortened list is provided here.
You are now ready to move on to other modules in the psychological Soar tutorial. Take a look at the further readings and references that follow, and then go BACK a level or two to find other parts of the tutorial.
Newell, A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press.