The AM is designed to work similarly to the Vision Module. There is a store of "features" called the audicon, and these can be transformed into ACT chunks by way of an attention operator. "Features" in the audicon are, of course, not things that have spatial extent like visual features, but instead have temporal extent--they are sound events.
Each sound event has several attributes: the kind of sound (tone, digit, or speech), its onset and duration, its content, a content delay (the time after onset before the content can be detected), and a recode time (the time needed to encode the content once it is attended).
Support will later be added for spatial location of the sound, for use in modeling things like dichotic listening.
Basic kinds of sounds currently supported are tones, digits, and speech. The content delay and recode time for tones and digits are uniform (settable via system parameters), as is digit duration. Since speech strings can differ in length, and presumably content delay and recoding time, these must be supplied when sound events are created. Finally, the audicon has a decay parameter (default is three seconds). After a sound event ends and the decay time elapses, the sound event is deleted from the audicon.
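To make these timing rules concrete, here is a minimal sketch in plain Python (not ACT-R code; the class, its field names, and the numeric values are all invented for illustration) of when a sound event's content becomes detectable and when the event is deleted from the audicon:

```python
from dataclasses import dataclass

@dataclass
class SoundEvent:
    """Illustrative model of a sound event in the audicon."""
    onset: float          # time the sound starts (seconds)
    duration: float       # temporal extent of the sound
    content_delay: float  # delay before the content can be detected
    recode_time: float    # time to recode the content once attended
    decay: float = 3.0    # audicon decay parameter (default three seconds)

    def content_available(self, now: float) -> bool:
        # the content can be detected only after the content delay elapses
        return now >= self.onset + self.content_delay

    def in_audicon(self, now: float) -> bool:
        # the event is deleted after it ends and the decay time elapses
        return now < self.onset + self.duration + self.decay

tone = SoundEvent(onset=0.0, duration=0.5, content_delay=0.05, recode_time=0.285)
print(tone.content_available(0.02))  # False: content delay has not elapsed
print(tone.in_audicon(3.4))          # True: within duration + decay
print(tone.in_audicon(3.6))          # False: deleted from the audicon
```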
There are two ways to use the Audition Module. One of them more or less parallels vision, using tests built into the production syntax to deal with audio. For example, if you wanted the earliest unattended sound in the audicon, this test would find it:
=event>
    isa audio-event
    onset lowest
    time now
Note that the detect-time of the sound (after onset) has to have passed for this to match. To shift auditory attention to it, send an attend-sound command to the Audition Module. Once a sound event has been attended, after some time (determined by the recode time for that sound) a sound chunk will be created. The base-level activation of new sound chunks is controlled by the parameter :sound-base-level. Sound chunks have three slots: type for the type of sound, content for the content, and pitch for the pitch range (high, middle, low). For example, the spoken string "nine" would have a type of digit, a content of "nine", and a pitch of middle (unless the speaker is someone like Kerri Strug, in which case high might be more appropriate). If this were the last attended sound, it could be matched with:
=snd>
    isa sound
    time now
    type digit
    content "nine"
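The attend-and-recode sequence can be sketched as follows (plain Python, not ACT-R syntax; the function name, the dictionary representation of the chunk, and the timing values are a simplified illustration, not the module's actual implementation):

```python
# Illustrative sketch: attending a sound yields a sound chunk with
# slots type, content, and pitch, after the sound's recode time.
def attend_sound(sound_type: str, content: str, pitch: str,
                 recode_time: float, now: float):
    """Return (completion_time, chunk) for attending a sound at time `now`."""
    chunk = {"type": sound_type, "content": content, "pitch": pitch}
    return now + recode_time, chunk

done_at, chunk = attend_sound("digit", "nine", "middle",
                              recode_time=0.3, now=1.0)
print(done_at)  # the chunk appears one recode time after attending
print(chunk)    # {'type': 'digit', 'content': 'nine', 'pitch': 'middle'}
```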
There is another paradigm for audio perception, the listen-for command. It takes parameters similar to those used by find-sound: currently these are :attended, :onset, and :type. The standard use of this command is to pass NIL for :attended and LOWEST for :onset, which causes the AM to encode the earliest sound not already attended. If the appropriate sound event's content is not yet available, the AM's preparation state will be set to BUSY. Once the event's content is available but not yet recoded, the AM's preparation state will be set to FREE and its execution state to BUSY. This command automatically executes an attend-sound command when an appropriate audio event occurs.
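The preparation/execution transitions just described can be summarized with a small sketch (plain Python; a simplified model that assumes the content becomes available at onset plus content delay, and the sound chunk appears one recode time later):

```python
def am_state(now: float, onset: float, content_delay: float, recode_time: float):
    """Return the (preparation, execution) state of the AM while a
    listen-for request is pending (simplified illustration)."""
    content_ready = onset + content_delay  # content becomes available
    recoded = content_ready + recode_time  # recoding finishes
    if now < content_ready:
        return ("BUSY", "FREE")   # waiting for the sound's content
    elif now < recoded:
        return ("FREE", "BUSY")   # content available, still recoding
    else:
        return ("FREE", "FREE")   # sound chunk has been created

print(am_state(0.02, 0.0, 0.05, 0.3))  # ('BUSY', 'FREE')
print(am_state(0.10, 0.0, 0.05, 0.3))  # ('FREE', 'BUSY')
print(am_state(0.40, 0.0, 0.05, 0.3))  # ('FREE', 'FREE')
```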
Sounds in the environment are simulated by creating sound events through the Lisp functions new-digit-sound, new-tone-sound, and new-other-sound. Parameters for these commands are documented in the Parameter Reference.