Models of higher-level vision  [FRG/FER]
-----------------------------------------

[GDB] Neisser's perceptual cycle again - see earlier comments. [fg: not done]

[An understanding of higher-level vision is] reported as necessary for continued development of models in synthetic environments (Laird, Coulter, Jones, Kenny, Koss, & Nielsen, 1997). We may use Kosslyn and Koenig's (1992) definition: higher-level visual processing involves using previously stored information; lower-level visual processing does not involve such stored information, and is driven only by the information impinging on the retina. Thus, we will not deal here with mechanisms used for finding edges, computing depth, etc.

Higher-level vision (HLV) is of interest for military modeling because it bears on several questions about information stored in long-term memory (LTM):

1. how LTM information indicates incoming danger or a serious change in the environment
2. how it directs attention
3. how it integrates various aspects of information, or information occurring at different times
4. how it is used to facilitate learning
5. how it is used in planning and problem solving

To put it simply, HLV is at the interface between lower-level vision (LLV) and postulated memory entities such as productions, schemata, concepts, etc. At present, this interface is poorly understood, perhaps because neither LLV nor LTM knowledge is yet understood in a sufficiently stable way. (However, see Kosslyn and Koenig, 1992, for neuropsychological hypotheses about HLV.)

Most models of cognition, such as Soar and ACT (actually, most architectures reviewed by Pew & Mavor), use modeler-coded information, which avoids dealing with the interface between LLV and LTM constructs. Neural nets have been used to go from pixel-like information to features or even higher-level constructs, but they have not been incorporated into models of higher-level cognition.
CaMeRa (Tabachneck-Schijf, Leonardo, & Simon, 1997) and, to a certain extent, EPAM (Feigenbaum & Simon, 1984; Richman & Simon, 1989) explore ways in which features may be extracted from low-level representations and combined into LTM constructs. This is undoubtedly an area where more research should be carried out. For example, modeling instruction and training requires some theory of how low-level acoustic input merges with low-level visual input and connects to LTM knowledge.

Tabachneck-Schijf, H.J.M., Leonardo, A.M., & Simon, H.A. (1997). CaMeRa: A computational model of multiple representations. Cognitive Science, 21, 305-350.

[Frank, you should already have the other refs.]

Below is the table summarizing EPAM/CHREST. I've referred to the Pew and Mavor book more than Sloman did.

Best wishes to you and to Colleen,
-Fernand

EPAM/CHREST
-----------

ORIGINAL PURPOSE
    model high-level perception, learning, and memory

SENSING AND PERCEPTION
    visual, auditory
    perceptual discrimination is in real time (assuming a feature-based description of objects)

WORKING/SHORT-TERM MEMORY
    4-7 slot STM; in some versions (e.g., EPAM-IV), there is a more detailed implementation of an auditory (Baddeley-like) STM and a visual STM

LONG-TERM MEMORY
    Discrimination net. In recent versions, nodes of the discrimination net are used to create a semantic net and productions

MOTOR
    eye movements, simple drawing behaviour

KNOWLEDGE REPRESENTATION:
  DECLARATIVE [KNOWLEDGE]
    chunks, schemas (templates); (use nodes in discrimination net)
  PROCEDURAL [KNOWLEDGE]
    productions; (use nodes in discrimination net)

HIGHER-LEVEL COGNITIVE FUNCTIONS:
  LEARNING
    chunking, creation of schemas and of productions
    learning is on-line (incremental) and stable against erroneous data
  PLANNING
    connections between templates are used in planning
  DECISION MAKING
    knowledge based
  SITUATION ASSESSMENT
    overt and inferred

MULTITASKING
    serial processing; learning done in parallel

LONG-TERM MEMORY
    networks of chunks, schemata and productions

RESOURCE REPRESENTATION
    limited STM capacity, limited perceptual and motor resources (uses time parameters)

GOAL/TASK MANAGEMENT
    bottom-up + 1 main goal per task simulated

MULTIPLE HUMAN MODELING
    potential through multiple EPAM modules

IMPLEMENTATION PLATFORM(S)
    Mac, PC (any system supporting Lisp)
    Graphical environment supported only on Mac

LANGUAGE
    Lisp

SUPPORT ENVIRONMENT
    Lisp programming + editing tools
    Some graphical utilities for displaying eye movements, structure of discrimination tree, task
    A lot of customized code for each task modelled

VALIDATION
    Extensive at many levels

COMMENTS:
    EPAM models focus on a single, specific information-processing task at a time.
    Not yet scaled up to multitasking situations.
    Used in high-knowledge domains (e.g., chess, with about 300,000 chunks)
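
As a rough illustration of two entries in the table above (a discrimination net as long-term memory and a limited-capacity STM), the following Python sketch sorts feature-based object descriptions through a small net and learns incrementally. It is a toy under stated assumptions, not the EPAM or CHREST implementation (which is in Lisp). The names `Node`, `DiscriminationNet`, `sort`, `recognize` and `learn`, the dictionary-of-features object format, the rule for choosing which feature to test, and the default of 4 STM slots are all hypothetical simplifications.

```python
from collections import deque


class Node:
    """One node of a toy discrimination net (hypothetical structure)."""

    def __init__(self, image=None):
        self.image = dict(image or {})  # partial feature description (the "chunk")
        self.test = None                # feature tested below this node; None for a leaf
        self.children = {}              # feature value -> child Node


class DiscriminationNet:
    """Toy LTM (discrimination net) plus a slot-limited STM of recognized nodes."""

    def __init__(self, stm_slots=4):
        self.root = Node()
        # Limited-capacity STM: the oldest entry is dropped when all slots are full.
        self.stm = deque(maxlen=stm_slots)

    def sort(self, obj):
        """Follow feature tests from the root as far as the object allows."""
        node = self.root
        while node.test is not None and obj.get(node.test) in node.children:
            node = node.children[obj.get(node.test)]
        return node

    def recognize(self, obj):
        """Sort the object, place the reached node in STM, return its image."""
        node = self.sort(obj)
        self.stm.append(node)
        return node.image

    def learn(self, obj):
        """One incremental learning step for a single presentation of obj."""
        node = self.sort(obj)
        if node.test is not None:
            # Internal node without a branch for this value: grow a new branch.
            value = obj.get(node.test)
            node.children[value] = Node({node.test: value})
            return
        conflicts = [f for f in sorted(node.image) if obj.get(f) != node.image[f]]
        if conflicts:
            # Discrimination: the stored image conflicts with the object, so
            # turn this leaf into a test on the first conflicting feature.
            feature = conflicts[0]
            old_leaf = Node(node.image)
            new_leaf = Node({feature: obj.get(feature)})
            node.children = {node.image[feature]: old_leaf,
                             obj.get(feature): new_leaf}
            node.test = feature
            node.image = {f: v for f, v in node.image.items() if f not in conflicts}
        else:
            # Familiarization: add one more feature of the object to the image.
            for feature in sorted(obj):
                if feature not in node.image:
                    node.image[feature] = obj[feature]
                    break


net = DiscriminationNet(stm_slots=4)
red_square = {"color": "red", "shape": "square"}
blue_square = {"color": "blue", "shape": "square"}
for _ in range(2):
    net.learn(red_square)      # two familiarization steps fill in both features
net.learn(blue_square)         # conflict on "color" triggers discrimination
print(net.recognize(red_square))
print(net.recognize(blue_square))
```

Each call to `learn` changes the net by at most one test, branch, or feature, which is one simple way to read "on-line (incremental)" learning. The time parameters, templates, and productions listed in the table are deliberately left out.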