Review of General Psychology	© 1999 by the Educational Publishing Foundation
March 2000 Vol. 4, No. 1, 25-58	For personal use only--not for distribution.

Illusory Correlations
A Simple Associative Algorithm Provides a Convergent Account of Seemingly Divergent Paradigms

Klaus Fiedler
Psychologisches Institut University of Heidelberg
ABSTRACT

Subjective correlations that exaggerate objectively presented contingencies are usually referred to as illusory correlations. An empirical review reveals 3 major paradigms of illusory correlations, drawing on 2 prominent but conflicting gestalt principles, congruency and distinctiveness. Congruency accounts for expectancy-based illusory correlations, whereas distinctiveness is relevant to illusions resulting from the asymmetry of positive and negative attributes and from infrequency. The congruency principle implies a processing advantage for expected stimuli, whereas distinctiveness assumes enhanced processing of unexpected events. This apparent conflict is resolved, and an integrative account is offered within a simple connectionist framework (BIAS) of correlation assessment. The basic algorithm is outlined, empirical findings are simulated, new theoretical distinctions are introduced, and analogies to related paradigms are explained.

Illusory correlations have become a prominent research topic not only in modern social psychology (Crocker, 1981; Fiedler, 1985; Hamilton, 1981) but also in personality research (Shweder, 1975, 1977a), diagnostics (Berman & Kenny, 1976; L. J. Chapman & Chapman, 1967), learning and conditioning (Alloy & Tabachnik, 1984; Orr & Lanzetta, 1980; Shanks & Dickinson, 1987), memory (Arkes & Harkness, 1983; Kao & Wasserman, 1993), and applied psychology (Bettman, John, & Scott, 1986; Jussim, 1986). There are (at least) two reasons for the prominence of this topic.

On the one hand, the detection and assessment of event correlations are among the most fundamental operations an adaptive organism must acquire. Human (or animal) intelligence requires learning of categorization, discrimination (i.e., differential categorization), and correlation (i.e., related discrimination in different dimensions). Note that all three mental operations form a logical hierarchy such that categorization enables discrimination, which then enables the understanding of correlation. Just as in Inhelder and Piaget's (1958) developmental theory of intelligence, the ability to detect, quantify, and use correlations is taken for granted as a basic module in almost every theory of learning, attribution, language comprehension, and inductive reasoning.

On the other hand, the popularity of illusory correlations is mainly due to the challenges and provocations that are explicit or implicit in the term illusory. The intriguing and often pessimistic message is that people's insensitivity to actual correlations can lead to serious biases and shortcomings of social judgment. And these biases, in turn, contribute to stereotyping ( Hamilton, 1981 ), erroneous decisions ( Einhorn & Hogarth, 1978 ), injustice ( Smither, Collins, & Buda, 1989 ), intergroup discrimination ( Hamilton & Gifford, 1976 ), and false clinical judgment ( Dawes, 1989 ).

Scope of the Present Article

The aim of the present article is to present a review and a comprehensive explanatory framework of research on illusory correlations and its intriguing implications in social and cognitive psychology. In several respects, this review is atypical in format and scope. After an examination of the empirical research, a computer-based model is presented that provides an explicit algorithm of the informational processes underlying correlation assessment. Through the use of this integrative framework, within which the diverse variants of illusory correlations can be located, the intention is to convey the following points.

First, research on illusory correlations has evolved in separate paradigms and fields of application, without an integrative conceptual framework. Second, as a result of this lack of a comprehensive framework, there is little theoretical exchange or cross-referencing between major paradigms. Some approaches to illusory correlations, covered under different paradigm labels, have gone largely unnoticed. Third, as a consequence, the theoretical scope and the domain of application have not been fully acknowledged. Fourth, analogous to the separately evolved paradigms, an implicit assumption has been that different and sometimes opposite versions of the illusion are bound to separate task conditions. In this way, fruitful theoretical conflicts could be avoided. Finally, a distinction is highlighted between two sources of bias in correlation assessment: one that is itself biased and originates in expectations, motives, and selective processing and one that is unbiased in nature and not mediated by selective forces or processes. It is argued that explaining biased correlations by biased processes runs the risk of becoming circular ( Wallach & Wallach, 1994 ).

The article is divided into two major parts. The first section is devoted to a review of the three bodies of literature: (a) expectancy-based illusory correlations, (b) differential weighting of positive and negative stimuli in covariation assessment, and (c) infrequency-based illusory correlations. Each subsection starts with an explication of the phenomenon and an empirical review that covers historical origins as well as recent findings and what they reveal about the underlying psychological processes. A final conclusion summarizes well-established findings, mentions open questions, delimits the domain, and attempts to identify the theory heuristic that has guided (and restricted) the research in each paradigm.

Although the review of empirical research yielded a compelling set of well-established findings, with intriguing and often provocative implications, a less satisfactory picture arises at the theoretical level. No integrative theoretical framework is available that connects and coordinates the findings from separate paradigms (for a notable exception, see Busemeyer, 1991 ). Moreover, the psychological assumptions proposed to explain the underlying cognitive processes in the three major paradigms have remained isolated and, in some respects, even contradictory. Thus, the second part of the article attempts to depict the outlines of a unitary framework based on a simple connectionist feedforward model (the Brunswikian induction algorithm for social inference [BIAS] model; Fiedler, 1996 ). Within this framework, different origins of illusory correlations can be distinguished analytically, and previously neglected factors can be identified. Apparent theoretical conflicts can be resolved, and the contribution of traditional paradigms can be located. The algorithm reveals particularly how unbiased processes can lead to biased outcomes and how different types of illusory correlations are not mutually exclusive.

Paradigms of Illusory Correlation: A Review

The aim here is not to provide an exhaustive bibliography. Research on illusory correlations has led to so many publications spread over so many literatures and applied domains that almost no combination of keywords in a bibliographic database would yield a complete list of references. Excellent reviews can be found in earlier publications by Allan (1993) , Alloy and Tabachnik (1984) , Crocker (1981) , Nisbett and Ross (1980) , and Papini and Bitterman (1990) . Recent empirical findings that were not covered by these previous reviews did not alter the major findings and conclusions. Rather than trying to be exhaustive, the present review intends to explain the current state of the art in terms of the historical origins and early theoretical commitments that have restricted research on illusory correlations.

A number of different keywords and combinations thereof were used to gather the bibliographical basis of this review (as summarized in Tables 1-3). These keywords included illusory correlation, correlation, covariation, and contingency combined with assessment, judgment, and detection as well as in conjunction with distinctiveness, expectancy, and cognitive biases. However, the pooled list of references would be too large to review and would hardly justify the journal space, because the majority of articles report replications and applications that do not really increase theoretical understanding. Therefore, only a subset of references was included, guided by the goal of pointing out historical origins, uncovering seminal findings that have influenced further research, and doing justice to important theoretical issues. It is unlikely that any selectivity arising from this partly subjective procedure will interfere with the primary goal, namely, to elucidate unresolved issues within a unitary framework.

Defining the Domain of Inquiry

A potentially severe problem is whether the term illusory correlation refers to a unitary phenomenon at all or whether it is but a label that unsystematically connects some phenomena and excludes others that may be equally relevant to covariation assessment. This is, of course, a matter of definition, but definitions need not be arbitrary. The present approach assumes that illusory correlations do, in fact, represent a clearly definable class of phenomena with obvious face validity and a systematic reference to cognitive theory. An illusory correlation is simply defined as a subjective correlation assessment that deviates systematically from an objectively presented correlation. In the simplest, qualitative case, the illusion consists in perceiving a correlation that is actually not there. In the more general case, the definition also includes overestimates, underestimates, reversals, and other distortions of "real" correlations.

One might conjecture that normative statistical models may not provide an ultimate criterion of "real" correlations, but this problem can be ignored in the present context, because the experimental comparisons on which illusory correlations are based (e.g., the same stimulus series presented with different stimulus labels) rarely require a normative criterion. Operationally, the domain is restricted to inductive-statistical assessment or learning of correlations from a series of observations that may be provided by the experimenter or result from the participant's own information search. This definition excludes studies based on summary statistics presented in tables or texts. Thus, the phenomenon under focus is a clearly circumscribed, homogeneous class of cognitive operations: the extraction of a statistical rule from a bivariate (or multivariate) series of stimuli presented over time. Typical of this cognitive task is the interplay of bottom-up (stimulus-driven) and top-down (knowledge-driven) components (i.e., the competition of old knowledge or expectations with new stimulus data).

The three paradigms that are the focus of the next three subsections emerge naturally from this field of two complementary forces. The first paradigm, expectancy-based illusory correlations, highlights the top-down influence of expectations that may override stimulus data. A typical feature of this paradigm is that meaningful stimulus materials are used, and participants' prior knowledge constitutes an essential aspect of the experimental design. In contrast, in the second paradigm, expectations are ruled out so that covariation assessment can be studied as a pure function of stimulus properties (e.g., perceptual variables such as present vs. absent features). In the third paradigm, the emphasis is neither on prior expectations nor on perceptual aspects of individual stimuli; rather, the typical design involves manipulations of the distribution of different stimulus subsets while holding the overall correlation constant.

The very fact that research has mainly evolved in these three paradigms and that the present article focuses on these paradigms is not arbitrary but naturally reflects three facets of the dialectic interplay of old knowledge and new stimuli. Admittedly, one might point to several other paradigms that are also highly relevant to illusory correlations but not commonly associated with this label, such as work on causal and statistical reasoning, attribution errors, or in-group favoritism. That these paradigms are not treated explicitly does not mean that they are neglected or that they would lead to divergent conclusions. Many of these additional phenomena are easily assimilated by the three major paradigms (e.g., in-group favoritism and attribution errors as expectancy-based effects). Others that are of distinct theoretical value are mentioned in later sections.

Expectancy-Based Illusory Correlations

Many judgment and evaluation problems call for the fair and impartial assessment of empirical observations, uncontaminated by the observer's subjective beliefs or wishes. However, the notion of expectancy-based illusory correlations highlights people's inability to keep old expectations apart from new empirical data. Accordingly, the typical research design in this paradigm pits top-down influences of prior knowledge or expectations against bottom-up influences stemming from stimulus data. However, the explicit task is only to assess the correlation in the stimulus data and to disregard prior knowledge.

Review of historical origins and recent state of the art.

Much earlier than in cognitive and social psychology, such illusory correlations were introduced as a challenge to standard procedures in diagnostics. L. J. Chapman and Chapman ( 1967 , 1969 ) were the first to demonstrate illusions in users of projective techniques such as the Rorschach inkblot test and the draw-a-person test. When presented with a series of test results (e.g., picture drawings) along with the patient's diagnosis, diagnosticians report an enhanced coincidence of particular symptoms (e.g., anomalous head) and associated categories (e.g., worry about intelligence), even though these expected pairings did not appear more frequently than unexpected pairings ( L. J. Chapman & Chapman, 1967 ). The diagnostic stereotypes that governed these (mis)perceptions were so superficial and obvious (cf. Shweder, 1977b ) that even laypeople could anticipate the rules used by the expert diagnosticians. After all, associating anomalous heads with intelligence or anal features with homosexuality does not require much professional expertise. Analogous illusions were demonstrated in subsequent years with reference to many other diagnostic procedures, such as the incomplete sentence blank ( Starr & Katkin, 1969 ) and observation techniques ( Berman & Kenny, 1976 ; D'Andrade, 1974 ), the common denominator being that diagnostic expectations mislead diagnosticians to perceive correlations that actually do not exist.

A similar point was made, and most strongly articulated, by Shweder ( 1975 , 1977a ) regarding correlations between personality traits. When trait-relevant behaviors are observed, the trait correlations that are later memorized closely follow the semantic similarity between trait terms. In fact, semantic similarity is a better predictor of reported correlations than the actually observed statistical relations ( Shweder, 1982 ). According to Shweder, the whole endeavor of personality research may thus be grounded on an illusory network that confounds likelihood with likeness.

It is worth noting that the same expectation effects that govern meaningful observation tasks in diagnostics and personality are also possible at lower levels of learning, such as classical conditioning. Garcia and Koelling (1966) demonstrated long ago that an unconditional stimulus such as sickness can be more easily associated with conditional stimuli in the olfactory modality, whereas electrical shock can be more effectively paired with distinct auditory signals. In a somewhat different vein, it is easier to associate negative than positive facial expressions with an aversive shock (Orr & Lanzetta, 1980).

In recent times, expectancy-based illusory correlations have continued to attract research interest. However, whereas the challenging idea has been extended to many domains, the basic psychological principle has received little modification. As summarized in Table 1 , recent applications concern illusory correlations in clinical (e.g., de Jong & Merckelbach, 1993 ; DiBattista & Shepherd, 1993 ) and organizational (e.g., Camerer, 1988 ; Smither et al., 1989 ) contexts. Other studies have addressed potential moderator variables, showing enhanced illusions under arousal ( Kim & Baron, 1988 ), in older people ( Mutter & Poliske, 1994 ), and when judges' knowledge base is high ( Billman, Bornstein, & Richards, 1992 ). However, these moderator analyses are guided by pragmatic issues rather than distinct theoretical questions.

In social psychology, illusory correlations have reached a prominent status in stereotype research. A stereotype is commonly defined (cf. McCauley & Stitt, 1978; Rothbart & John, 1985; Stangor & McMillan, 1992) as an expected correlation, typically illusory but potentially veridical, between groups and some attribute (behavior, trait, or role). Given this analogous definition, stereotypes are special cases of expectancy-based illusory correlations.

In the social domain, expectations may not always originate in semantic or epistemic knowledge; rather, they may sometimes originate in affective goals or wishful thinking. Thus, perceivers who are themselves members of one group are typically biased to perceive correlations that assign more positive attributes to the in-group than to the out-group ( Brewer, 1979 ). A similar variant of wishful thinking can be found in self-perception. Alloy and Abramson (1979) presented their participants with a simple contingency game in which they could try to control the onset of a light by either pressing or not pressing a button. Normal participants (as opposed to depressives) typically overestimated the degree of control they exerted over the light onset, even when it was noncontingent. Note that the apparatus was free of any meaning, so epistemic expectations could hardly have influenced this variant of unrealistic optimism (cf. Weinstein, 1980 ).

Contribution to psychological theory.

In any case, the literature on illusory correlations is replete with provocative and partly ingenious demonstrations of expectation effects in social and applied contexts. However, the contribution of this prominent research field remains mainly empirical. A growing body of evidence has been accumulated for an empirical law, namely, that prior expectations can, and almost universally do, intrude into the subjective assessment of stimulus correlations. The fascination of this phenomenon mainly arises from its power and robustness and the irrational flavor of people's inability to keep old knowledge and new data apart.

At the level of psychological theory, the impact of prior knowledge, the Kantian notion that human intelligence does not start as a tabula rasa, is simply taken for granted as an axiom or theoretical primitive. No explicit attempt is made to explain how and why it is that events that are expected to go together or resemble each other appear to be more frequent than unexpected or dissimilar pairings. Note that quite heterogeneous sources of expectation were used in the studies reviewed here, ranging from stereotypical beliefs to semantic similarity, biological preparedness, and wishful thinking. Note, in particular, that no systematic distinction was made between similarity-based and expectancy-based illusory correlations. No explicit process theory seems to be required for such a fully normal expectation effect. Implicitly, the interplay of inductive and deductive influences is assumed to follow a simple compromise: When observing the contingency between events, the intelligent organism does not start from zero but is already prepared with prior expectations rooted in older knowledge. To the extent that empirical stimulus data are incomplete, impoverished, masked, or forgotten as a consequence of imperfect memory, the resulting uncertainty gap can be filled with epistemic expectations, which afford useful default knowledge.

Within this plausible and seemingly uncontestable compromise model, there has been little controversy between competing theories, and reflections on the underlying cognitive process have been largely confined to locating the stages in the cognitive process that are sensitive to expectancies. Most experiments have emphasized the role of expectancy-driven encoding and selective recall ( Hamilton, 1981 ; C. Hoffman & Hurst, 1990 ), but the early stages of information search ( Klayman & Ha, 1987 ; Pyszczynski & Greenberg, 1987 ) and perception ( Fiedler, Hemmeter, & Hofmann, 1984 ) may contribute as well. The considerable counterevidence that exists for enhanced processing of expectancy-incongruent information ( Hastie, 1980 ; Srull & Wyer, 1989 ) has largely been ignored in this paradigm (as described subsequently).

Summary and leading theory heuristic.

Expectancy-based illusory correlations have been observed in hundreds of experiments, showing that correlation judgments under uncertainty (stimulus load and memory loss) reflect a compromise between actual observations and prior expectancies. Theoretical explanations have taken for granted that expectancy-congruent biases can occur at different stages of cognitive processing (information search, perception, encoding, recall, judgment, and communication), although stages may differ in sensitivity. Conflicting evidence for selective processing of incongruent information has rarely been related to illusory correlations (for a notable exception, see Garcia-Marques & Hamilton, 1996 ).

The notion of expectancy-congruent processing is so plausible and self-evident that it is presupposed as a theoretical primitive that need not itself be explained. Cacioppo, Gardner, and Berntson (1997) have recently identified this sense of obviousness as a major block in scientific progress. Note that the congruency concept, as a theory heuristic, is deeply rooted in a priori gestalt principles that hold not only for experimental participants but for researchers as well. Thus, if I expect that women are low in leadership ability, consistency theories ( Festinger, 1957 ; Read, Vanman, & Miller, 1997 ; Shultz & Lepper, 1996 ) will predict that pertinent observations should be interpreted in a manner congruent with this cognition. If the target of observation is female, the expectancy linking females to leadership inability will cause ambiguous behavioral observations to be interpreted accordingly (i.e., as weak, disorganized, hesitating, and low in charisma). Just as balanced structures can be learned more readily than unbalanced ones ( de Soto, 1960 ), the basic gestalt principle of congruency predicts that stimulus observations should be adjusted to preexisting expectancies.

Because the expectancy paradigm is mainly governed by the gestalt heuristic of congruency, it contributes little to cognitive theory. Neither the question of stimulus encoding and representation nor an algorithmic description of the assessment process is illuminated within this paradigm. It does not even afford an explicit model of the interaction between new stimuli and old expectancies.

Distinctiveness and Positive-Negative Asymmetry

Ironically, the obviousness of an expectancy bias may itself reflect an expectancy bias in the mind of researchers who continue to believe in congruency, although there is compelling counterevidence for an advantage of incongruent information in memory ( Hastie, 1980 ; Srull & Wyer, 1989 ; Stangor & McMillan, 1992 ). Incongruent, expectancy-deviant, surprising, or conflict-prone observations are particularly likely to be salient during perception, to be elaborated deeply at encoding, and therefore to be highly accessible in recall. Although such an incongruency effect is clearly at variance with a universal congruency principle, it is rooted in an equally common gestalt metaphor, the contrast of a figure against the ground. Salient, distinctive stimuli that deviate markedly from the background or baseline are likely to become the focus of attention and to give rise to illusory correlations distorted toward distinctive, salient, unexpected, and attention-grabbing stimuli.

Empirical evidence for this principle has developed in two paradigms. The first paradigm is concerned with the perceptual distinctiveness of positive as opposed to negative stimuli and their impact on learning and memory ( Shaklee & Mims, 1982 ; Shanks & Dickinson, 1987 ; Wasserman, Dorner, & Kao, 1990 ). In the other paradigm, distinctiveness refers to the contrast of outstanding stimuli against the remaining list. This has become a central issue in modern social psychology ( Fiedler, 1991 ; Hamilton & Gifford, 1976 ). It is addressed in the next section.

Review of historical origins and recent state of the art.

In an early investigation conducted by Jenkins and Ward (1965) , the task was to figure out the extent to which an outcome (a lighted circle symbolizing "success") could be controlled by pressing one of two buttons. When the actual contingency was zero, in that both buttons had the same success rate, participants experienced more control when the constant success rate was high (e.g., 75%) rather than low. This control illusion was obviously due to the fact that subjective contingency is mainly sensitive to the number of successful trials and rather insensitive to complementary feedback on negative trials.

This bias to attend to present features more than to absent features was also highlighted by Nisbett and Ross (1980) and confirmed by numerous studies conducted since Jenkins and Ward's seminal article (e.g., Crocker, 1982 ). For example, when judging therapy success, people usually consider the number of patients recovered after psychotherapy and fail to consider the rate of spontaneous recovery without psychotherapy ( Eysenck, 1956 ). Or when judging correlations between a symptom and a disease, they usually assign the greatest weight to the number of cases in which both the symptom and the disease are present ( Smedslund, 1963 ).

In a similar vein, judgments of the observed impact of a cause (e.g., plants receiving fertilizer) on an effect (blooming) are more sensitive to observations of the consequences of present causes and the antecedents of present effects than to observations of absent causes or missing effects (Kao & Wasserman, 1993). Thus, judgments of causation do not generally follow Cheng and Novick's (1990) contrast model or the underlying delta rule, which states that causal judgments reflect the difference between two conditional probabilities: the probability of the effect given that the cause is present and the probability of the effect given that the cause is absent. Instead, the bias toward positive information is evident in the so-called density bias (Allan, 1993), showing that, when delta is held constant, judged causality or control increases with the absolute occurrence rate of the effect (e.g., A. G. Baker, Berbrier, & Vallee-Tourangeau, 1989; Shanks, 1985). Thus, even when an effect occurs at the same rate in the presence as in the absence of a cause, the judged contingency is higher when the constant rate is high (e.g., 75%) rather than low (25%).
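The difference between the delta statistic and the density bias can be made concrete with a small calculation. What follows is a minimal sketch, not a reconstruction of any cited study: the function names and the trial counts (20 trials with the cause and 20 without) are illustrative assumptions. It shows two trial series whose contingency is identical (zero) and that differ only in how often the effect occurs.

```python
from fractions import Fraction


def delta_p(effect_with_cause, no_effect_with_cause,
            effect_without_cause, no_effect_without_cause):
    """Return P(effect | cause present) - P(effect | cause absent)."""
    p_present = Fraction(effect_with_cause,
                         effect_with_cause + no_effect_with_cause)
    p_absent = Fraction(effect_without_cause,
                        effect_without_cause + no_effect_without_cause)
    return p_present - p_absent


def effect_density(effect_with_cause, no_effect_with_cause,
                   effect_without_cause, no_effect_without_cause):
    """Return the overall rate at which the effect occurs, ignoring the cause."""
    total = (effect_with_cause + no_effect_with_cause
             + effect_without_cause + no_effect_without_cause)
    return Fraction(effect_with_cause + effect_without_cause, total)


# Hypothetical trial counts (assumed for illustration only):
# the effect occurs on 75% of trials in one series and on 25% in the other,
# regardless of whether the cause is present.
high_density = (15, 5, 15, 5)
low_density = (5, 15, 5, 15)

for label, table in (("high", high_density), ("low", low_density)):
    print(label, "delta:", delta_p(*table), "density:", effect_density(*table))
```

A judge who followed the delta statistic would rate both series alike, because the contrast is zero in each. The density bias corresponds to ratings that instead track the second quantity, the overall rate of the effect.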

On the basis of the empirical findings obtained in this paradigm (summarized in Table 2), a strong case can be made for positive-negative asymmetry (cf. Allan, 1993; Wasserman, Elek, Chatlosh, & Baker, 1993). Whenever one level of a dichotomous variable is more informative or diagnostic than the other level, contingency judgments will give more weight to observations representing the positive level.

Contribution to psychological theory.

As mentioned earlier, the asymmetric impact of positive versus negative information reflects a general gestalt principle: Present features (e.g., a traffic sign or an observed symptom) provide the figure before the ground. Absent features (an absent traffic sign or a missing symptom) are less informative because they do not reveal the nature of what is missing ( Garner, 1978 ).

However, research in this area has not made a systematic attempt to explain why, how, and under what boundary conditions positive features are more informative than negative features. Just like the expectancy effect described earlier, positive-negative asymmetry has been taken for granted as a plausible account of biased contingency assessment in this paradigm, covering diverse, intuitively chosen operationalizations of present versus absent features.

Rather than further pursuing the origins of perceptual salience, the theoretical emphasis was on testing covariation learning models, formulated as algebraic functions of the four frequencies a, b, c, and d in a standard 2 × 2 contingency table. In this notation (see Figure 1), a is the number of observations in which both attributes (e.g., cause and effect) are present, b represents the present-absent case, c represents the absent-present case, and d represents the absent-absent case. In spite of some ongoing debate about the specific integration rule and its moderators, there is wide agreement that a receives a much higher weight than d (reflecting the positivity bias) and relative agreement in the ordering a > b > c > d (Kao & Wasserman, 1993).

Note that the prominent approach to model correlation assessment as a function of stimulus frequencies, or probabilities a/(a + b) and c/(c + d), originates in a major concern with models of learning and conditioning (such as the delta rule) that are defined in terms of these statistics. Rather sophisticated methods and designs have been developed to test and quantify the impact of the four cell frequencies, based on the systematic variation of a, b, c, and d in the stimulus series (Wasserman et al., 1990). However, whereas the empirically obtained weighting rules may afford "paramorphic" models (P. J. Hoffman, 1960) of correlation judgments, they cannot be regarded as models of the cognitive process.
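To illustrate what a weighting rule of this kind looks like, the sketch below contrasts the confirming cells (a and d) with the disconfirming cells (b and c) after multiplying each frequency by a weight. The specific functional form, the function name, and the weights (4, 3, 2, 1) are assumptions chosen only to respect the ordering a > b > c > d; they are not estimates from the literature. In keeping with the paramorphic caveat, the rule describes an input-output relation and makes no claim about the cognitive process.

```python
from fractions import Fraction


def weighted_judgment(cells, weights):
    """Weighted contrast of confirming (a, d) against disconfirming (b, c) cells.

    cells   -- frequencies (a, b, c, d) of a 2 x 2 contingency table
    weights -- hypothetical weights (w_a, w_b, w_c, w_d), one per cell
    """
    a, b, c, d = cells
    w_a, w_b, w_c, w_d = weights
    confirming = w_a * a + w_d * d
    disconfirming = w_b * b + w_c * c
    return Fraction(confirming - disconfirming, confirming + disconfirming)


# Two hypothetical zero-contingency tables: the effect occurs on 75% or 25%
# of the trials, whether or not the cause is present.
high_density = (15, 5, 15, 5)
low_density = (5, 15, 5, 15)

equal_weights = (1, 1, 1, 1)      # every cell counts the same
unequal_weights = (4, 3, 2, 1)    # assumed ordering a > b > c > d

for label, table in (("high", high_density), ("low", low_density)):
    print(label,
          "equal:", weighted_judgment(table, equal_weights),
          "unequal:", weighted_judgment(table, unequal_weights))
```

With equal weights, both tables yield a judgment of zero. With the unequal weights, the high-density table yields a positive value and the low-density table a negative one, so this simple rule reproduces the direction of the density bias even though the objective contingency is zero in both cases.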

Summary and leading theory heuristic.

The extra distinctiveness of positive observations, which constitutes the major theory heuristic of this paradigm, can be considered a well-established law of inductive learning. However, aside from their implications for general learning models, these findings reveal little about the cognitive representation of positive and negative information or the underlying memory algorithm. Moreover, the impact of expectations and content specificity, which is central to the former paradigm, is largely excluded from this paradigm in which prior knowledge is typically ruled out as a factor to be controlled experimentally.

Distinctiveness and Infrequency

The power of distinctive, attention-grabbing events in regard to producing illusory correlations is not confined to the perceptual salience of present (positive) as opposed to absent (negative) features. Distinctiveness may also arise from the infrequency of outstanding stimuli within a list. This variant dates back to the famous von Restorff (1933) effect. The typical task used by Hedwig von Restorff, a student of Koehler in Berlin, consisted of a series of numbers in which, say, one letter string was inserted, or pairs of nonsense syllables with singular pairs of other materials in between. In subsequent memory tests, the outstanding stimuli were shown to have a clear memory advantage.

Review of historical origins and recent state of the art.

Although the von Restorff effect is rarely cited explicitly, it had a huge impact on empirical approaches to illusory correlation. L. J. Chapman (1967) showed that the frequency of outstanding pairs of stimulus events (e.g., lion-tiger) was overestimated relative to less distinctive pairs (e.g., bacon-tiger). In a well-known experiment conducted by Taylor, Fiske, Etkoff, and Ruderman (1978), participants observed a videotaped group discussion in which one Black (White) and five White (Black) individuals took part or one woman (man) and five men (women) took part. The relative contribution of the less frequent type of discussant was regularly overestimated. Likewise, McArthur (1980) reviewed research showing that salient stimulus persons are given more attention and are perceived to exert more "social causality" than less salient persons.

By far the most important elaboration of the von Restorff phenomenon took place in social psychology, following the seminal work of Hamilton and Gifford (1976). Their stimulus series consisted of 26 behavior descriptions pertaining to Group A (the majority) and 13 behavior descriptions pertaining to Group B (the minority). The correlation between group membership and the desirability (positivity) of behaviors was zero in that the same ratio of positive to negative behaviors held for both groups (i.e., 18+:8- for Group A and 9+:4- for Group B). As it turned out, however, the larger group was consistently associated with the predominant valence of behavior (i.e., Group A appeared more positive when positive behaviors were more frequent), whereas the smaller group was more associated with the valence of the less frequent behavior (i.e., negativity). This was apparent in different dependent measures, such as frequency estimates, trait impression ratings, and cued-recall tests of group–behavior associations. This noteworthy phenomenon has been replicated, validated, and extended in numerous studies (see Mullen & Johnson, 1990).

The enormous popularity of the Hamilton–Gifford paradigm is mainly due to the social psychological challenge it conveys. Minorities are, by definition, less numerous than majorities, and negative behavior is norm deviant and therefore less frequent than positive, norm-conforming behavior (Taylor, 1991). Thus, the Hamilton–Gifford paradigm provides an analog of the stimulus environment that characterizes real minorities. Given the same ratio of (prevailing) positivity in large and small groups, the impressions and cognitive representations of minorities will be relatively negative. This raises a pessimistic perspective on the problem of minority discrimination. A synopsis of relevant literature is provided in Table 3.

Contribution to psychological theory.

Adopting von Restorff's gestalt notion of distinctiveness, Hamilton and colleagues (cf. Hamilton & Sherman, 1989 ) reasoned that the joint infrequency of small groups and rare behaviors renders negative minority behavior particularly distinctive and salient. As a consequence, stimulus items that belong to the most distinctive category should be encoded more deeply and should therefore have a memory advantage, which is assumed to mediate the resulting illusory correlation.

Over many years, no theoretical alternative was apparent to this account, which was regarded as empirically well established. Only in the last few years have new findings and computer simulations provided alternative explanations of the Hamilton–Gifford phenomenon without the assumption of a memory advantage for infrequent events (Fiedler, 1991; Smith, 1991). Crucial to these accounts is sample size; if positive behavior prevails by the same ratio in two groups, this prevalence will be more apparent in the larger group because of the larger number of observations or learning trials. Theoretically, this approach places illusory correlations in the context of a basic law of learning, namely, that a constant "reinforcement ratio" (i.e., ratio of positive behaviors of a group) is learned more effectively as the number of learning trials increases. It is not necessary to assume changes in the learning parameters (i.e., a memory advantage for particular events).

Summary and leading theory heuristic.

Almost all research in this paradigm is related to judgments of majority versus minority groups. Applications of the major theory heuristic, distinctiveness, to formally analogous stimulus distributions in other content areas have been rare (see Table 3 ). In the absence of a comprehensive cognitive model, the latent conflict with the expectancy paradigm (which predicts a stronger influence of expected rather than distinctive information) has not been addressed explicitly. Nor has the relation to the second paradigm, the perceptual asymmetry of present versus absent features, been delineated systematically. With regard to the cognitive underpinnings, modern research tools of cognitive psychology have rarely been applied to substantiate the supposed memory advantage for infrequent observations (as described later). Cognitive boundary conditions are only crudely apparent in the general finding that infrequency-based illusory correlations are most pronounced under suboptimal encoding conditions. Thus, the illusion increases when memory load is high ( McConnell, Sherman, & Hamilton, 1994b ), learning is incidental rather than intentional ( Pryor, 1986 ), and stimuli refer to groups rather than individuals ( McConnell, Leibold, & Sherman, 1997 ; see meta-analysis by Mullen & Johnson, 1990 ).

Theoretical Reflection

Although the two gestalt notions of congruency and distinctiveness are among the most prominent building blocks of psychological theory formation ( Heider, 1958 ; Hunt, 1995 ; Kunda & Thagard, 1996 ; von Restorff, 1933 ), the theoretical foundation they provide for an explanation of illusory correlations is less than satisfactory. It entails the danger of being circular on the one hand and contradictory or incoherent on the other.

Circularity is present when a judgment bias toward expected correlations (e.g., strong male leaders) is explained by enhanced weight given to expected information, or when a bias in favor of salient, unexpected information (e.g., strong female leaders) is explained by a tendency for enhanced processing of unexpected observations. Moreover, the contradiction between the two gestalt notions of congruency and distinctiveness cannot be discarded as merely rhetorical. Congruency implies enhanced weight given to expected information, whereas distinctiveness gives superiority to unexpected, outstanding information. Thus, a priest's benevolent behavior should override his criminal behavior (congruency), but a priest's criminality should be particularly attention grabbing, leading to an opposite judgment bias.

From a metatheoretical view, it seems fair to characterize the situation as follows. Depending on what outcome is obtained in one particular context (i.e., a bias toward expected or unexpected behavior), theories rely on the congruency metaphor or on the distinctiveness metaphor. This post hoc feature, or hindsight reasoning, creates theoretical dissatisfaction. What is strongly needed is a comprehensive theoretical framework that allows for the forward prediction of illusory correlations, from antecedent conditions to observed consequences.

Stangor and McMillan (1992) offered an elegant solution to this problem in an elucidating meta-analysis. On the basis of signal-detection analyses, these authors drew a systematic distinction between genuine memory of original stimuli and guessing inferences involving prior knowledge. When the experimental task relies heavily on memory for original stimuli, the deeper encoding of incongruent information determines the outcome. However, when the task invites top-down inferences from prior knowledge structures, an advantage for expectancy-congruent information becomes apparent.

This solution relies on process dissociation. There might be no conflict or even contradiction at all between congruence and distinctiveness (see also Garcia-Marques & Hamilton, 1996 ) if both principles apply under completely different conditions: The congruency or expectancy bias might operate directly on covariation judgments, independently of the memory representation and recall of stimuli, which profit from the encoding advantage of incongruency. However, this peaceful solution of the conflict is more apparent than real. In fact, there is evidence that expectancies affect not only final judgments but also the perception, encoding, and disambiguation of stimuli from the beginning ( Fiedler et al., 1984 ). In the preceding example, the priest's behavior is more likely to be perceived, classified, and encoded as benevolent than mean (cf. Trope & Liberman, 1993 ). Moreover, these congruency effects on memory are correlated with later judgment biases. Likewise, the distinctiveness bias (at least in the infrequency paradigm) is not merely a matter of memory for original stimuli but reflects a substantial judgment bias ( Klauer & Meiser, 1998 ). Thus, the conflict does exist and cannot be discarded beforehand as belonging to two genuinely separate situations.

What are the properties that one would expect of a unifying theoretical framework? Three indispensable criteria are that the framework should indicate an algorithm that is valid, fertile, and noncircular. In other words, it should (a) account for the empirical evidence, (b) lead to new predictions, and (c) be precise in explicating the processes rather than relying solely on verbal paraphrases of the phenomena to be explained.

A Simple Distributive Learning Model

A simple learning model is now depicted that has the potential to meet all of the preceding criteria. It can account for expectancy-based and distinctiveness-based illusory correlations within the same framework. The model deviates from traditional accounts in several nontrivial predictions. In addition to illusory correlations reflecting biased cognitive processes, it predicts similar illusions when no biased processes are involved, thus overcoming the status of a theory that has to assume one bias to explain another. Moreover, the model helps to avoid conceptual confusion and to isolate variants of illusory correlations. And, finally, it offers the power and precision of a theory that is explicated as a transparent computer algorithm amenable to everyone rather than dependent on interpretations of a few privileged theorists ( Read et al., 1997 ; Smith, 1996 ).

The model to be proposed is derived from the BIAS framework ( Fiedler, 1996 ) that explains judgmental biases as a consequence of simple rules of associative learning in a probabilistic environment. The Brunswikian premise, or starting assumption, is that most meaningful correlation tasks refer to distal entities (e.g., health, leadership, danger, or femininity). This premise has crucial implications for the nature of the stimulus input and its cognitive representation. Because the distal variables cannot be perceived directly, the meaning of stimulus observations (e.g., a woman's leadership ability) has to be inferred from vectors of multiple probabilistic cues. For example, the cues that mediate the "perception" of leadership ability may include status symbols, formal dress, strong voice, and upright posture (see Figure 2 ).

It is typical of such distal perception that singular cues have rather modest diagnosticities; only the concert of multiple probabilistic cues warrants valid perception. Neither strong voice nor upright posture, nor any other cue alone, would afford a reliable index of leadership. However, misleading information in some cues can be compensated by other cues so that, over multiple cues, the distal variable can be inferred with reasonable validity. As a result of this imperfect, flexible relationship between the distal variable and its proximal cue indicators, the same value on a distal variable can appear in many different patterns or configurations. Thus, leadership behavior (even of the same person) does not always appear as the same pattern of voice, posture, facial expression, and so forth (just as lies, danger, attractiveness, and many other distal concepts are manifested in diverse cue patterns).

An essential implication of the Brunswikian approach is that the basic format of a stimulus is not a singular scalar (or scale value amenable to direct perception) but a distributive pattern of information (a vector of multiple cue values). Figure 2 illustrates how one can think of the generation of distributive stimulus patterns. Let the column vector l_ideal denote the ideal type of a distal entity indicating, say, maximal leadership ability. "Ideal" refers to the errorless pattern in which all cue values point to high leadership, without exception (i.e., status symbols present [+], politeness absent [-], etc.). Note that the ideal pattern l_ideal is a theoretical construct; actual observations of leadership behavior are "noisy copies" that do not match the ideal in all cue values. Twelve such observations (e.g., 12 encounters with the same politician) are depicted in the middle of Figure 2, with each column vector of the matrix representing one stimulus (l₁, l₂, ..., l₁₂). Each of these stimulus patterns was generated by randomly inverting three of the nine cue values defining ideal leadership. Thus, although the ideal meaning of leadership is not apparent from each observation, the majority of cues point to leadership. Therefore, the resemblance of each stimulus to the distal entity is clearly above chance; the correlations between the stimulus patterns and the ideal pattern l_ideal are in the range of .32 to .48. The extent to which stimulus patterns resemble the distal entity depends on the noise parameter i, the proportion of features inverted at random (e.g., if three of nine cues are inverted, i = .33). Psychologically, i reflects the amount of information that is lost, either in the ecology (as a result of imperfect cue validity) or in the cognitive system (as a result of inaccurate perception and memory loss).
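The generation of noisy copies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ideal pattern `L_IDEAL` below is hypothetical (the actual cue values of Figure 2 are not reproduced here), cue values are coded +1 and -1, and the helper names `noisy_copy` and `pearson_r` are introduced for this sketch. With five "+" and four "-" cues in the ideal, inverting exactly three cues always yields a resemblance between roughly .32 and .48, which agrees with the range reported above.

```python
import random

# Hypothetical ideal pattern with nine binary cues (+1 = cue points to high
# leadership, -1 = it does not). The real Figure 2 pattern is not reproduced.
L_IDEAL = [1, -1, 1, 1, -1, 1, -1, 1, -1]

def noisy_copy(ideal, n_inverted, rng):
    """Copy `ideal` and invert exactly `n_inverted` randomly chosen cue values."""
    inverted = set(rng.sample(range(len(ideal)), n_inverted))
    return [-v if k in inverted else v for k, v in enumerate(ideal)]

def pearson_r(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

rng = random.Random(1)
# Twelve observations, each with three of nine cues inverted (i = 3/9 = .33).
stimuli = [noisy_copy(L_IDEAL, 3, rng) for _ in range(12)]
resemblance = [pearson_r(s, L_IDEAL) for s in stimuli]
print([round(r, 2) for r in resemblance])
```

Inverting a fixed number of cues, rather than flipping each cue independently, follows the description of Figure 2; either choice would serve for the sketch.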

Inductive judgments involve, by definition, the collection of multiple observations; just as the basic format of a stimulus is a vector (not a scalar), the entire stimulus series yields a matrix of stimulus vectors (as in Figure 2 ). Forming an inductive judgment of the entire stimulus series can be conceived as an aggregate over all matrix columns; in the simplest case, the aggregate could be the horizontal, featurewise sum of all "+" and "-" values, as shown on the right of Figure 2 . Crucial to understanding the simulation model is that the aggregate pattern resembles the ideal pattern from which all stimuli were generated more strongly ( r = .78) than the individual stimuli (i.e., repeated encounters will clearly reveal the politician's leadership qualities).

This reflects a most essential and consequential property of distributive stimulus representations arising from noisy multicue environments: Through aggregation over an increasing number of stimuli, the noise or error variance in the stimulus matrix is canceled out, and the systematic variance component (reflecting the distal variable) becomes increasingly visible. A single observation of an individual high in leadership ability may not reveal this latent trait. However, as the sample of observations increases, the aggregation process will make the strong leadership style more and more apparent, just as with an image on film that is gradually illuminated. (In a similar vein, aggregation over several observations is necessary to extract someone's attitude from multiple remarks in a discussion, to determine someone's intelligence from multiple items of a test, or to figure out someone's dishonesty, humor, or other distal attributes.)
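The noise-canceling effect of aggregation can be checked with a small sketch. The ideal pattern, the helper names, and the sample sizes in the loop are assumptions chosen for illustration; the aggregate is the plain featurewise sum mentioned above, not a claim about the exact function used for Figure 2.

```python
import random

L_IDEAL = [1, -1, 1, 1, -1, 1, -1, 1, -1]  # hypothetical nine-cue ideal pattern

def noisy_copy(ideal, n_inverted, rng):
    """Copy `ideal` and invert exactly `n_inverted` randomly chosen cue values."""
    inverted = set(rng.sample(range(len(ideal)), n_inverted))
    return [-v if k in inverted else v for k, v in enumerate(ideal)]

def pearson_r(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def aggregate(stimuli):
    """Featurewise (horizontal) sum over all stimulus vectors."""
    return [sum(cue_values) for cue_values in zip(*stimuli)]

def aggregate_resemblance(n_stimuli, rng):
    """Resemblance of the aggregate of `n_stimuli` noisy copies to the ideal."""
    stimuli = [noisy_copy(L_IDEAL, 3, rng) for _ in range(n_stimuli)]
    return pearson_r(aggregate(stimuli), L_IDEAL)

rng = random.Random(2)
for n in (1, 12, 100, 400):
    print(n, round(aggregate_resemblance(n, rng), 2))
```

With a single stimulus the aggregate is the stimulus itself, so its resemblance stays in the range of individual observations; as the number of stimuli grows, the printed values should approach 1.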

Before the BIAS model is explained further and applied to illusory correlations, a few comments are in order regarding the problem of symbol grounding and parameter settings. First, the model is not restricted to binary cue values (plus vs. minus, as in Figure 2) but easily extends to cues that vary continuously on a quantitative scale. Second, the particular cues used for illustration were selected for convenience and can be substituted by many other cue sets. It is essential for Brunswik's (1955, 1956) cognitive–ecological approach that an adaptive organism changes and substitutes cues in a highly flexible fashion (e.g., depth perception during the day vs. at night). All of the implications derived from the model are purely structural and independent of the particular cues used for illustration. Third, the quantitative predictions will of course depend on several parametric decisions concerning the specific aggregation function (e.g., sum or weighted average), specific similarity measures (correlation or otherwise), number of cues, and so forth. However, the important qualitative results generalize over many reasonable aggregation functions and parameter values.

Finally, it is important to recognize that BIAS entails little restriction concerning the temporal course of the cognitive process. Thus, the multiple columns of the stimulus matrix do not necessarily imply a multitrace model ( Hintzman, 1988 ; Smith, 1991 ) in which individual stimulus exemplars are conserved until a final judgment is based on these original entries. Rather, aggregation may occur during encoding as well, or bunches of stimuli will be aggregated to higher order chunks during encoding, and the final judgment will transform these medium-sized aggregates into an overall aggregate. Whatever the temporal order and segmentation of these aggregation functions, the central implications remain the same.

Applying BIAS to Correlation Assessment Tasks

The same basic prompting and aggregation process that has been outlined so far can now be applied to illusory correlations. The BIAS algorithm can simulate illusory correlations in two fundamentally different ways. On the one hand, traditional interpretations of the illusion can be simulated by ad hoc parameter settings reflecting the selective processing of salient, expected, or desired categories. Simulation of these cases is clearly due to those additional assumptions (e.g., increased weight given to expected stimuli) rather than intrinsic properties of BIAS. On the other hand, and more intriguing, the BIAS algorithm alone produces at least one variant of each type of illusory correlation without ad hoc parameters. Whereas the former demonstrations refer to illusions due to biased processes, the latter simulations highlight that no biased processes are needed to explain biased outcomes.

BIAS can even predict variants that have not yet been discovered by empirical research. The reported simulations begin with the unbiased case of accurate correlation assessment, to further illustrate and validate the algorithm, and then address illusory correlations based on genuine expectancy biases, similarity of meaning, and various types of distinctiveness. Simulation studies are discussed in terms of their cognitive analogs, and relevant empirical evidence is mentioned.

Modeling the correlation between two variables (e.g., leadership and health) requires that the stimulus vectors include cues that speak to both distal entities. Two subsets of cues, or segments, are thus distinguished in the left part of Figure 3. The upper segment contains cue information about health, in this case, ideal patterns describing high (h₊) and low (h₋) health. The lower segment contains information about high (l₊) or low (l₋) leadership (note that the l₊ pattern is identical to the one in Figure 2). To simulate the correlation between health and leadership (conceived as dichotomous variables), a series of stimulus patterns must be generated that represent the four event combinations (2 health levels [H+ vs. H-] × 2 leadership levels [L+ vs. L-]) with particular frequencies. In Figure 3, the stimulus distribution reflects a positive correlation; there are 12 H+L+ cases (Cell A), 6 H+L- cases (Cell B), 6 H-L+ cases (Cell C), and 12 H-L- cases (Cell D). Each stimulus pattern is a noisy copy of the corresponding combination of ideal patterns (h₊ vs. h₋ combined with l₊ vs. l₋), assuming a noise ratio of i = .3; that is, 30% of all cue values have been inverted at random.

One way to operationalize correlation judgment is to compare the differential leadership impressions of healthy and unhealthy people (analogous to the so-called delta rule; Allan, 1993). Accordingly, BIAS computes a leadership impression (aggregate) associated with healthy people and a leadership aggregate associated with unhealthy people; the difference between these two impressions (i.e., of leadership given a high vs. low level of health) affords a measure of the simulated correlation. First consider the way in which BIAS arrives at a leadership impression judgment of healthy people. The task instruction to judge healthy people corresponds to using a prompt vector p_h+ that has the ideal type of health, h₊, in the upper (health) segment and zero or missing values in the lower (leadership) segment. This reflects a situation in which the prompt to judge healthy people is given and the leadership slot is open to be judged. This judgment prompt, p_h+, is compared with all stimulus patterns in the matrix, and the dot product (number of matches minus mismatches) is used as a measure of the degree to which each stimulus is relevant to the prompted judgment domain (healthy). Each stimulus pattern is then weighted by this dot product; that is, patterns are multiplied by their similarity to the prompt. The horizontal sum of these weighted patterns produces an aggregate a_h+ (not shown in Figure 3, analogous to the aggregate on the right of Figure 2). The leadership segment of this aggregate provides the leadership impression of healthy people. In the same way, the ideal type of nonhealth can be used as a prompt (p_h-) to generate an aggregate a_h- of unhealthy people, with the leadership impression in the respective segment.

As shown at the bottom of Figure 3, the aggregate a_h+ computed from the healthy prompt, p_h+, correlates more strongly with the ideal pattern l₊ for maximal leadership ability (in the leadership segment) than the aggregate a_h- computed from the unhealthy prompt, p_h- (r = .52 vs. r = .26), thus correctly simulating a positive relation between health and leadership. This illustrates the general algorithm by which the BIAS model can assess the correlation between variables. The cue patterns representing high and low levels on one variable are used as prompts, and the resulting aggregates are compared in terms of their resemblance to a pattern denoting a high value on the other, correlated variable. More precisely, the simulated correlation is the difference D = r(a_h+, l₊) - r(a_h-, l₊), in which the correlation coefficient (r) is used as a measure of similarity within the leadership segment.
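The prompting and weighting steps can be sketched as follows. This is a minimal illustration, not the original simulation program: the four ideal patterns are hypothetical, the low-level patterns are assumed to invert five of the nine cues of the high-level patterns, and noise is assumed to flip each cue independently with probability `noise`. The function names are introduced for this sketch.

```python
import random

# Hypothetical ideal patterns (nine binary cues each); Figure 3's values are not reproduced.
H_POS = [1, 1, -1, 1, -1, 1, -1, -1, 1]
L_POS = [1, -1, 1, 1, -1, 1, -1, 1, -1]
# Assumption: low levels invert the first five cues of the high-level pattern.
H_NEG = [-v for v in H_POS[:5]] + H_POS[5:]
L_NEG = [-v for v in L_POS[:5]] + L_POS[5:]
IDEALS = {"A": H_POS + L_POS, "B": H_POS + L_NEG, "C": H_NEG + L_POS, "D": H_NEG + L_NEG}
FIGURE_3_CELLS = {"A": 12, "B": 6, "C": 6, "D": 12}

def pearson_r(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def build_stimuli(cells, noise, rng):
    """Noisy copies of the cell ideals; each cue is flipped with probability `noise`."""
    return [[-v if rng.random() < noise else v for v in IDEALS[cell]]
            for cell, n in cells.items() for _ in range(n)]

def prompted_aggregate(prompt, stimuli):
    """Sum of all stimuli, each weighted by its dot product with the prompt."""
    agg = [0] * len(prompt)
    for s in stimuli:
        weight = sum(p * v for p, v in zip(prompt, s))  # matches minus mismatches
        agg = [a + weight * v for a, v in zip(agg, s)]
    return agg

def simulated_correlation(stimuli):
    """D = r(a_h+, l+) - r(a_h-, l+), evaluated on the leadership segment only."""
    n = len(H_POS)
    open_slot = [0] * len(L_POS)  # leadership segment left open in the prompt
    a_pos = prompted_aggregate(H_POS + open_slot, stimuli)[n:]
    a_neg = prompted_aggregate(H_NEG + open_slot, stimuli)[n:]
    return pearson_r(a_pos, L_POS) - pearson_r(a_neg, L_POS)

stimuli = build_stimuli(FIGURE_3_CELLS, 0.3, random.Random(3))
print(round(simulated_correlation(stimuli), 2))
```

A single noisy run can deviate considerably from the expected value, so the sign of D is best judged over repeated runs or with `noise` set to 0.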

Simulating Illusory Correlations

Having demonstrated how the model works and how it applies to judgments of actual correlations, I now address the crucial issue of illusory correlations. Cognitive process assumptions are translated into the BIAS algorithm. The transparency gained from explicating these assumptions within the BIAS framework is sufficient to make analytical distinctions between variants of illusory correlations that are normally confused. Simulations were run to demonstrate virtually all types of the phenomenon, some of which are the product of fully unbiased information processes.

Because the purpose was to simulate illusory rather than veridical correlations, the simulated frequency distributions were deliberately chosen to represent zero correlations. Within this restriction, both skewed and unskewed frequency distributions were used, on the basis of either equal cell frequencies, a = b = c = d = 10, or unequal cell frequencies, a = 20, b = 10, c = 10, d = 5 (see notation in Figure 1 ). In either case, the correlation was zero, because the same frequency ratio holds for both rows [a/(a + b) = c/(c + d)] and columns [a/(a + c) = b/(b + d)].
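The zero-correlation property of both distributions can be verified with exact arithmetic. The function names below are introduced for this sketch; the cell labels a, b, c, and d follow the notation referred to above.

```python
from fractions import Fraction

def row_contingency(a, b, c, d):
    """Difference between the row ratios a/(a+b) and c/(c+d); zero means no correlation."""
    return Fraction(a, a + b) - Fraction(c, c + d)

def column_contingency(a, b, c, d):
    """Difference between the column ratios a/(a+c) and b/(b+d)."""
    return Fraction(a, a + c) - Fraction(b, b + d)

# The unskewed and the skewed distribution used in the simulations.
for cells in [(10, 10, 10, 10), (20, 10, 10, 5)]:
    print(cells, row_contingency(*cells), column_contingency(*cells))
```

Using `Fraction` avoids floating-point rounding, so a zero contingency is reported as exactly zero.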

Each simulation began from a randomly chosen ideal pattern representing positive levels on both variables, say, h₊ for a high level of health and l₊ for high leadership ability. Nine binary cues were used to characterize these two ideal patterns. Ideal patterns for low or negative levels of each variable, h₋ and l₋, were then constructed by randomly inverting five of the nine cue values of the positive antonym. In a distributive memory model, antonyms are not represented as perfectly inverse patterns, because both poles (e.g., high and low leadership) share many features pertaining to the semantic meaning of the joint dimension (e.g., both poles involve leadership). Again, this assumption is not essential for the reported results. The combination of h₊, h₋ and l₊, l₋ yields ideal patterns for the four stimulus categories (i.e., Cells A, B, C, and D occurring with frequencies a, b, c, and d; see Figure 3).

Stimulus vectors were generated by copying the corresponding ideal type and randomly inverting a proportion i of all values. As already noted, this noise factor may reflect various sources of information loss, such as imperfect cue validity, unreliable perception, or memory decay. (Memory decay may be more adequately modeled by cue values set to zero; for simplicity, however, only inversions were used.) The number of stimuli generated from each ideal type corresponded to the cell frequencies a , b , c , and d , according to the simulated distribution.

High and low levels of one variable were chosen for convenience as prompts (p_h+ and p_h-), and the high level of the other variable, l₊, was used as a criterion for evaluating the respective aggregates, a_h+ and a_h-. Finally, the simulated correlation between health and leadership was computed as follows: D = r(a_h+, l₊) - r(a_h-, l₊). This correlation is positive (negative) if the difference is positive (negative). The whole procedure was repeated 100 times, for 100 simulated "subjects"; new random patterns and newly generated stimulus matrices were used in each run.
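The complete procedure (random ideals, antonyms, noisy stimuli, prompting, and repetition over simulated subjects) might be sketched as follows. Several details are assumptions made for this illustration: each cue is flipped independently with probability `noise`, the default noise level of .3 is borrowed from the Figure 3 example, random ideals are redrawn until they contain both signs so that r is defined, and all function names are hypothetical.

```python
import random

N_CUES = 9

def random_ideal(rng):
    """Random +1/-1 pattern, redrawn until it contains both signs (so r is defined)."""
    while True:
        pattern = [rng.choice((1, -1)) for _ in range(N_CUES)]
        if len(set(pattern)) == 2:
            return pattern

def antonym(ideal, rng, n_inverted=5):
    """Low-level pattern: invert five randomly chosen cues of the high-level ideal."""
    idx = set(rng.sample(range(N_CUES), n_inverted))
    return [-v if k in idx else v for k, v in enumerate(ideal)]

def pearson_r(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def prompted_leadership(health_prompt, stimuli):
    """Leadership segment of the aggregate; weights are dot products on the health
    segment (equivalent to a full prompt with zeros in the leadership segment)."""
    agg = [0] * N_CUES
    for s in stimuli:
        weight = sum(p * v for p, v in zip(health_prompt, s[:N_CUES]))
        agg = [a + weight * v for a, v in zip(agg, s[N_CUES:])]
    return agg

def run_subject(cells, noise, rng):
    """One simulated subject: returns D = r(a_h+, l+) - r(a_h-, l+)."""
    h_pos, l_pos = random_ideal(rng), random_ideal(rng)
    h_neg, l_neg = antonym(h_pos, rng), antonym(l_pos, rng)
    ideals = [h_pos + l_pos, h_pos + l_neg, h_neg + l_pos, h_neg + l_neg]  # Cells A-D
    stimuli = [[-v if rng.random() < noise else v for v in ideal]
               for ideal, n in zip(ideals, cells) for _ in range(n)]
    a_pos = prompted_leadership(h_pos, stimuli)
    a_neg = prompted_leadership(h_neg, stimuli)
    return pearson_r(a_pos, l_pos) - pearson_r(a_neg, l_pos)

def mean_delta(cells, noise=0.3, n_subjects=100, seed=0):
    """Average simulated correlation over `n_subjects` independent runs."""
    rng = random.Random(seed)
    return sum(run_subject(cells, noise, rng) for _ in range(n_subjects)) / n_subjects

print(round(mean_delta((10, 10, 10, 10)), 3))  # unskewed zero correlation
print(round(mean_delta((20, 10, 10, 5)), 3))   # skewed zero correlation
```

For equal cell frequencies the two prompts are treated symmetrically, so the mean of D should stay close to zero; any systematic departure for the skewed distribution would then reflect the frequency distribution rather than a biased process.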

Within this general frame, the various sources of illusory correlations could be modeled. Table 4 provides a summary of distinct process assumptions on which the simulations were based. Implementing these cognitive process assumptions involved the following procedures.

Genuine expectancy effects.

Given the expectation that high (low) levels of health and leadership coincide, postulating an expectancy bias amounts to assuming a processing advantage of Cells A and D (observations of high leadership ability in healthy people and low leadership ability in unhealthy people) over Cells B and C. A systematic attempt to explicate this standard assumption in the literature on so-called expectancy-based illusory correlations within the BIAS algorithm leads to the distinction of (at least) three different processes that can all explain the dominance of expected over unexpected information (see Table 4 ). All three types represent biased processes reflected in additional parameter settings.

The first possibility is simply that people confuse actual stimulus data with older knowledge, stemming from previous observations or secondhand information about expected correlations. Thus, the a priori cell frequencies, before stimulus presentation, would not be zero; rather, they would reflect the prior expectancy that a + d > b + c . This case simply refers to appending additional column vectors for Cells A and D, reflecting self-generated stimulus expectancies in addition to actual observations.

Alternatively, the bias may result from selective memory favoring expected information. Thus, even when judges clearly understand the task to judge the correlation in the stimulus data and they do not confuse the data with older knowledge, they may process information about A and D more efficiently than information about B and C. This case can be simulated by setting the noise parameter i lower for Cells A and D than for Cells B and C. This could reflect enhanced attention, perception, encoding, or storage or reduced forgetting of expected information.

A third possibility is that there is neither source confusion nor selective memory for Cells A, B, C, and D, but the cognitive integration rule assigns higher weight to expected than unexpected stimuli (e.g., as a result of perceived validity or confidence). This refers to amplifying the weight given to stimulus vectors for Cells A and D relative to Cells B and C in the aggregation process. Because all of these process assumptions have a similar effect (namely, to overrepresent expected information in Cells A and D), only the latter case was simulated (i.e., double weighting of expected items). However, the other variants of an expectancy-based process can also be simulated easily.
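Two of the three expectancy mechanisms, source confusion and enhanced integration weights, can be illustrated with a noise-free sketch in which every stimulus equals its ideal pattern, so that the effect of the added parameter is visible without random variation. The ideal patterns, the number of self-generated items, and the function names are assumptions made for illustration; selective memory (lower noise for Cells A and D) requires noisy stimuli and is not covered here.

```python
# Hypothetical ideal patterns; low levels invert the first five cues (assumption).
H_POS = [1, 1, -1, 1, -1, 1, -1, -1, 1]
L_POS = [1, -1, 1, 1, -1, 1, -1, 1, -1]
H_NEG = [-v for v in H_POS[:5]] + H_POS[5:]
L_NEG = [-v for v in L_POS[:5]] + L_POS[5:]
CELL_IDEALS = [(H_POS, L_POS), (H_POS, L_NEG), (H_NEG, L_POS), (H_NEG, L_NEG)]  # A-D

def pearson_r(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def leadership_aggregate(prompt, cells, weights):
    """Noise-free aggregate: each cell contributes count * weight * similarity to prompt."""
    agg = [0] * len(L_POS)
    for (health, leadership), count, weight in zip(CELL_IDEALS, cells, weights):
        similarity = sum(p * v for p, v in zip(prompt, health))  # dot product
        agg = [a + count * weight * similarity * v for a, v in zip(agg, leadership)]
    return agg

def expected_delta(cells, weights=(1, 1, 1, 1)):
    """D = r(a_h+, l+) - r(a_h-, l+) for cell counts and weights in the order A, B, C, D."""
    a_pos = leadership_aggregate(H_POS, cells, weights)
    a_neg = leadership_aggregate(H_NEG, cells, weights)
    return pearson_r(a_pos, L_POS) - pearson_r(a_neg, L_POS)

unbiased = expected_delta((10, 10, 10, 10))
double_weighting = expected_delta((10, 10, 10, 10), weights=(2, 1, 1, 2))
# Source confusion: five self-generated items appended to Cells A and D (hypothetical number).
source_confusion = expected_delta((15, 10, 10, 15))
print(unbiased, round(double_weighting, 2), round(source_confusion, 2))
```

In this noise-free setting, appending self-generated vectors is equivalent to raising the cell counts, which is why both manipulations can share one function.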

Semantic similarity.

As shown in the empirical review section, illusory correlations based on semantic similarity are treated in a manner synonymous to expectation effects, the implicit assumption being that semantic meaning is a major source of expectations. Within the BIAS framework, it is apparent that both types of illusory correlation have to be distinguished for analytical reasons. An illusion based on semantic similarity may occur in the absence of any expectation about the sign and strength of the covariation and without any biased processes, merely as a reflection of the overlap in the semantic features that characterize correlated variables. Thus, even when observers have to learn a fully new correlation, with all prior expectancies eliminated, BIAS alone will produce a correlation if there is only some overlap in the cue patterns mediating the assessment of the two correlated variables.

BIAS offers a concrete explanation for this abstract theoretical statement. Consistent with leading approaches to similarity (Tversky, 1977), BIAS defines similarity in terms of feature overlap. Accordingly, the similarity between two attributes (such as health and leadership) depends on the number of common perceptual features shared by these attributes. Within BIAS, cue overlap affords a straightforward way to operationalize this notion of similarity. For instance, some of the cues (e.g., strong voice, upright posture, and no warm expression) that have been assumed to represent high leadership ability (Figure 2) are also indicative of strong health. This meaning overlap, which reflects semiotic confusion rather than any expectancies about the distribution of events, was simulated by overlapping vector segments (see Figure 4). The segments for the two attributes are not clearly separated; rather, they overlap in the middle portion. In addition to nine pure health cues and nine pure leadership cues, the simulation to be reported assumed four overlap cues to represent the similar meaning of health and leadership. (Again, the chosen degree of overlap affects the strength but not the quality of the simulated results.) Importantly, however, no ad hoc assumption on biased processing of particular events was added.

In summary, the analytical clarity of an algorithmic approach serves to refine and specify the notion of expectancy-based illusory correlations. Although commonly treated as a homogeneous phenomenon, expectancy effects can originate in such fundamentally different cognitive processes as source confusion (of observed and self-generated items), selective processing (of Cells A and D), and enhanced weighting of expected information during the final integration process. Even more important, from a theoretical point of view, is the insight that similarity-based and expectancy-based illusory correlations must not be confused, for analytical reasons. The semiotic mechanism underlying similarity effects can occur in the absence of any prior expectancy or belief, merely as a consequence of the confounding of the cues that mediate the assessment of distal attributes.

Distinctiveness through asymmetry.

The pervasive tendency to give unequal weight to information in different cells of a contingency table (see evidence in Table 2 ) reflects the enhanced informativeness or distinctiveness of positive attribute levels (e.g., about the presence of symptoms) as compared with the rather pallid and less distinctive appearance of negative attribute levels (e.g., absent symptoms). This basic asymmetry in the learning of positive versus negative information, often referred to as a feature-positive effect ( Jenkins & Sainsbury, 1970 ; Newman, Wolff, & Hearst, 1980 ), constitutes the crucial cognitive assumption explaining why Cell A (representing positive levels on both variables) normally receives the highest weight and why cell weights are ordered A > B > C > D.

This type of distinctiveness, based on the figure-ground asymmetry of positive and negative attribute levels, was simulated by different degrees of resemblance of stimuli to the respective ideal patterns. Let h₊ and l₊ be the ideals of positive values on the health and leadership dimensions, respectively. The negative ideals, h₋ and l₋, are then defined by a pattern that is inverted in six of the nine cues (rather than perfect inversion of all nine cues). Asymmetry of positive and negative information is then created as follows. Positive attribute levels are generated to be more diagnostic in that they resemble h₊ and l₊ in more features (setting i₊ = .33) than negative attribute levels resemble h₋ and l₋ (i₋ = .44). In this way, positive and negative information is encoded into differentially informative cue patterns.
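The generation of asymmetric stimuli can be sketched in a few lines of Python. This is a minimal illustration rather than the original simulation program. It assumes that the noise parameter i is the probability with which each cue of an ideal pattern is inverted, and the nine-cue ideal pattern (`IDEAL_POS`) is an arbitrary, hypothetical choice.

```python
import random

# Hypothetical nine-cue ideal for a positive attribute level.
IDEAL_POS = [1, 1, -1, 1, -1, -1, 1, -1, 1]
# Negative ideal: inverted in six of the nine cues, as described in the text.
IDEAL_NEG = [-c for c in IDEAL_POS[:6]] + IDEAL_POS[6:]


def noisy_copy(ideal, i, rng):
    """Assumed noise model: each cue is inverted with probability i."""
    return [-c if rng.random() < i else c for c in ideal]


def mean_resemblance(ideal, i, n_items, rng):
    """Average proportion of cues on which noisy copies match their ideal."""
    matches = 0
    for _ in range(n_items):
        item = noisy_copy(ideal, i, rng)
        matches += sum(1 for a, b in zip(item, ideal) if a == b)
    return matches / (n_items * len(ideal))


rng = random.Random(1)
res_pos = mean_resemblance(IDEAL_POS, 0.33, 5000, rng)  # expected near 1 - .33
res_neg = mean_resemblance(IDEAL_NEG, 0.44, 5000, rng)  # expected near 1 - .44
print(round(res_pos, 2), round(res_neg, 2))
```

Under this reading of i, positive stimuli match their ideal on roughly two thirds of the cues, whereas negative stimuli match theirs on little more than half, which is the sense in which positive information is the more diagnostic.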

A more radical way to simulate the asymmetry of present and absent attributes would be to start from a list of present features to represent positive variable levels and to add only a single cue, for negation, to represent a negative variable level. In this case, the asymmetry is even more apparent. However, in the present simulations, this case was ignored, and only the more subtle case was chosen in which positive information is encoded into slightly more distinctive cue patterns than negative information (i₊ = .33 vs. i₋ = .44). Note that although this assumption does not in itself entail a correlation bias, the BIAS algorithm alone can simulate illusory correlations under this condition.

Distinctiveness through infrequency.

The Hamilton-Gifford or von Restorff type of distinctiveness-based illusory correlation ( Table 3 ) rests on the cognitive assumption that the least frequent attribute combination is most salient and encoded at the deepest level, thus resulting in a memory advantage for rare events. This assumption is often used to explain illusory correlations obtained with skewed distributions in which absolute frequencies differ (a = 20, b = 10, c = 10, d = 5) but the correlation is zero [i.e., a/(a + b) = c/(c + d) = 20/(20 + 10) = 10/(10 + 5)]. If the extra distinctiveness of the five cases in Cell D (e.g., the five cases of unhealthy nonleaders) represents a memory advantage, the subjective correlation should rise above zero.
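That the skewed distribution carries no objective contingency can be verified with a short computation. The sketch below uses the cell frequencies given above; the phi coefficient is added here merely as a standard index of association in a 2 × 2 table and is not part of the BIAS algorithm.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient of a 2 x 2 contingency table with cells a, b, c, d."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


a, b, c, d = 20, 10, 10, 5          # skewed distribution from the text
p_upper = a / (a + b)               # proportion in the first row: 20/30
p_lower = c / (c + d)               # proportion in the second row: 10/15
phi = phi_coefficient(a, b, c, d)   # a*d equals b*c, so phi is zero

print(p_upper, p_lower, phi)
```

Both conditional proportions equal two thirds, and the cross-products a·d and b·c are both 100, so any nonzero subjective correlation obtained with these frequencies is illusory by definition.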

Within the BIAS model, the suggested memory advantage of Cell D can be specified to mean that little Cell D information is forgotten; this amounts to assuming a reduced noise parameter i for Cell D. Alternatively, one might assume that Cell D data receive an extra weight (as a result of increased confidence or deeper encoding). Because these two assumptions lead to similar effects, only the case of enhanced weights was simulated. In any case, this simulation represents a biased process, as evident in the ad hoc parameter set for Cell D.

Notably, the BIAS algorithm isolates one type of infrequency effect that is not mediated by distinctiveness at all. BIAS predicts an illusory correlation in the absence of any enhanced memory for Cell D, simply because a /( a + b ) = 20/(20 + 10) is psychologically more "significant" than c /( c + d ) = 10/(10 + 5). That is, a large sample of 20 leaders among 30 healthy people is worth more than 10 leaders among 15 unhealthy people, simply as a consequence of the unequal sample size. Simulations demonstrated that no increased weight or reduced i has to be assumed for Cell D; different sample sizes alone will produce illusory correlations. Thus, as for the other two classes of illusory correlations, at least one variant of infrequency-based illusions arises from the associative algorithm alone, in the absence of biased processes.
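The pure aggregation effect can be illustrated with a small simulation. The following is a hedged sketch and not the original BIAS program: it assumes that each stimulus is a noisy copy of an ideal cue pattern (each cue inverted with probability i), that a judgment is the correlation between the summed stimulus vectors of a group and the ideal pattern l₊, and that i = .33 for all cells. The ideal pattern `L_POS` and all function names are hypothetical.

```python
import math
import random

# Hypothetical nine-cue ideal for high leadership (l+); the low ideal (l-)
# is inverted in six of the nine cues, as described in the text.
L_POS = [1, 1, -1, 1, -1, -1, 1, -1, 1]
L_NEG = [-c for c in L_POS[:6]] + L_POS[6:]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def noisy_copy(ideal, i, rng):
    """Assumed noise model: each cue is inverted with probability i."""
    return [-c if rng.random() < i else c for c in ideal]


def judged_leadership(n_leaders, n_nonleaders, i, rng):
    """Sum all stimulus vectors of one group and correlate the sum with l+."""
    total = [0] * len(L_POS)
    for ideal, n in ((L_POS, n_leaders), (L_NEG, n_nonleaders)):
        for _ in range(n):
            for k, cue in enumerate(noisy_copy(ideal, i, rng)):
                total[k] += cue
    return pearson(total, L_POS)


def mean_judgment(n_leaders, n_nonleaders, i=0.33, n_judges=2000, seed=0):
    """Average judgment over simulated judges; no cell is weighted or favored."""
    rng = random.Random(seed)
    runs = [judged_leadership(n_leaders, n_nonleaders, i, rng)
            for _ in range(n_judges)]
    return sum(runs) / n_judges


r_healthy = mean_judgment(20, 10)            # Cells A and B
r_unhealthy = mean_judgment(10, 5, seed=1)   # Cells C and D
print(round(r_healthy, 2), round(r_unhealthy, 2))
```

Because noise cancels out more completely in the larger sample, the same 2:1 ratio of leaders to nonleaders is reflected more clearly in the aggregate for healthy people, even though all cells are processed identically. The exact values depend on the assumed parameters and should not be expected to match Table 5.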

Simulation Results for Specific Illusory Correlation Effects

Simulation studies were not conducted for all combinations of the preceding assumptions (Frequency Distributions × Types of Expectancy Effects × Degree of Overlap × Distinctiveness Types); rather, they were conducted only for specific conditions corresponding to real research paradigms or particularly interesting cases. The simulated conditions and results (all based on 100 simulated "subjects") are given in Table 5 .

Accurately assessed zero correlations.

The first row of Table 5 shows that BIAS correctly predicts zero correlations when the actual correlation is zero and there is no skew, no asymmetry of attribute levels, no selective forgetting, and no selective weighting. In this case, the aggregates a_h₊ and a_h₋ for healthy and unhealthy people were associated to a similar degree with the criterion of high leadership ability l₊ (r = .44 vs. r = .40).

Genuine expectancy effects.

When all other parameters remained unchanged and only the assumption of selective processing of Cells A and D was introduced (i.e., double weighting for A and D), the mean simulated correlation was substantial (.60 vs. .05). This case reflects a genuine expectancy effect. Of course, a very similar effect would result from reduced forgetting of information from Cells A and D (defined in terms of reduced i ).
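Both variants of a genuine expectancy effect, extra weight and reduced forgetting for Cells A and D, can be sketched as follows. This is an illustration under stated assumptions, not the original simulation: cell frequencies are set to 10 each, i is read as the per-cue inversion probability, the reduced value i = .17 is borrowed from the Cell D simulation reported below, and the ideal pattern and function names are hypothetical.

```python
import math
import random

L_POS = [1, 1, -1, 1, -1, -1, 1, -1, 1]      # hypothetical ideal l+
L_NEG = [-c for c in L_POS[:6]] + L_POS[6:]  # l-: inverted in six of nine cues


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def judged_leadership(cells, rng):
    """cells: (ideal, n_items, weight, i) tuples for one group of people."""
    total = [0.0] * len(L_POS)
    for ideal, n, weight, i in cells:
        for _ in range(n):
            for k, cue in enumerate(ideal):
                observed = -cue if rng.random() < i else cue  # assumed noise
                total[k] += weight * observed
    return pearson(total, L_POS)


def mean_difference(w_expected=1.0, i_expected=0.33, i_base=0.33,
                    n=10, n_judges=2000, seed=0):
    """Mean judged leadership of healthy minus unhealthy people."""
    rng = random.Random(seed)
    diff = 0.0
    for _ in range(n_judges):
        healthy = [(L_POS, n, w_expected, i_expected),    # Cell A (expected)
                   (L_NEG, n, 1.0, i_base)]               # Cell B
        unhealthy = [(L_POS, n, 1.0, i_base),             # Cell C
                     (L_NEG, n, w_expected, i_expected)]  # Cell D (expected)
        diff += judged_leadership(healthy, rng) - judged_leadership(unhealthy, rng)
    return diff / n_judges


unbiased = mean_difference()                       # no selective processing
double_weight = mean_difference(w_expected=2.0)    # Cells A and D count twice
less_forgetting = mean_difference(i_expected=0.17) # Cells A and D less noisy
print(round(unbiased, 2), round(double_weight, 2), round(less_forgetting, 2))
```

In this sketch the unbiased condition yields a difference close to zero, whereas either form of privileged processing of Cells A and D raises judged leadership for healthy people relative to unhealthy people.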

Similarity based on cue overlap.

Similarity-based illusory correlations may be independent of expectancies. In the absence of any selective processing, an artificial correlation may be perceived because the respective cue sets are confounded. Note that in this simulation, all items from all cells were processed with the same weight and accuracy, and no asymmetry or skewed distribution was involved. The extremely strong illusory correlation (.86 vs. .39) reflects the high overlap proportion (through four common cues, with nine specific cues for h and l ) and suggests that much weaker overlap is sufficient for the illusion.

Asymmetry effects.

As a means of simulating the pure asymmetry version of illusory correlations, all biases were avoided, and separable (nonoverlapping) nine-cue segments were used to represent both variables. Asymmetry was introduced merely by making all negative stimuli somewhat less diagnostic (i₋ = .44) than positive stimuli (i₊ = .33). This slight asymmetry in diagnosticity of positive and negative attribute levels caused illusory correlations (r = .45 vs. r = .31).

Cell D distinctiveness.

Illusory correlations based on skewed stimulus distributions are often explained in terms of an alleged memory advantage for the most infrequent information in Cell D, thought to be particularly distinctive. As Table 5 confirms, the joint operation of skewed distributions and a Cell D advantage (reduced i of .17) led to a strong effect ( r = .58 vs. r = .18).

Pure aggregation effect.

When the bias in favor of Cell D is removed, the skewed distribution can produce a significant correlation (.67 vs. .36) without any distinctiveness effect. This is merely due to aggregation from different sample sizes without any processing bias.

Distinctiveness of frequent events.

To highlight the conceptual independence of distinctiveness and skewed distributions, the final simulation in Table 5 maintained the different sample sizes (giving rise to positive correlation judgments) but increased the distinctiveness (double weighting) of Cell C information (implying negative correlation). The net effect was a negative correlation (.38 vs. .71). However, this effect was weaker than the impact of Cell D distinctiveness because of the opposing trend of a positive correlation due to the skewed distribution.

Summary.

In conclusion, the simple associative rule underlying the BIAS algorithm can be used to model all "classical" types of illusory correlations within the same basic framework. Regardless of whether expectancy-based illusory correlations are simulated by selective weighting or forgetting, whether distinctiveness is introduced by asymmetric attribute levels or infrequent event classes, or whether distinctiveness is introduced as enhanced salience or enhanced memory, BIAS can simulate the illusions. Moreover, the potentially inverse influences of expectancies and distinctiveness can be located in different facets of the same cognitive process that becomes clear and transparent in the computer model.

Most important, the simulated illusions extend beyond the familiar notions of congruency and distinctiveness, showing that illusory correlations are obtained in the absence of any expectancies, merely as a result of cue overlap, and independently of distinctiveness, merely as a consequence of unequal sample sizes. All three traditional classes of illusory correlations (see Tables 1-3 ) could be simulated when biased-process assumptions were built into the algorithm via arbitrary parameters. However, and more novel from a theoretical point of view, for each class of phenomena at least one simulation was successful without any biased processes, as a natural consequence of the BIAS algorithm. This pertains to similarity-based illusory correlations due to feature overlap, to the asymmetry of positive and negative attribute levels, and to infrequency-based illusory correlations due to sample size alone.

Implications and Insights Gained From the BIAS Simulations

To the extent that the multicue assumption of BIAS applies to real information processing, the present simulations have rather challenging implications. They reveal a number of ways in which common theoretical explanations of illusory correlations, based on the two gestalt notions of congruency and distinctiveness, must be refined, extended, corrected, and tested in future research.

The literature on expectancy-based illusory correlations, to begin with, has failed to distinguish between two fundamentally different cognitive processes, only one of which is driven by expectancies. Whenever participants' prior expectancies have not been controlled directly, but the crucial independent variable has relied on semantic similarity of attribute meanings, the alleged expectancy effect may actually reflect semiotic confusion effects. As the BIAS algorithm elucidates, the tendency to report correlations between semantically similar attributes (e.g., leadership and health) may be due not to any processing advantage or higher weight given to expectancy-congruent stimuli but to overlapping cue sets contaminating the assessment of similar attributes. Even when covariations are learned among completely new stimuli, ruling out any prior expectancies, the similarity of stimulus features can lead to illusory contingencies.

Of course, the BIAS model can assimilate genuine expectancy effects as well, giving a processing advantage to expected stimulus events (in Cells A and D). However, a serious problem regarding the interpretation and validity of the entire literature on so-called expectancy biases remains, because few attempts have been made to set expectancy effects apart from similarity effects. In operational terms, validity checks have not separated individual judges' prior expectancies and the cues mediating the similarity of attributes.

How likely and how plausible is an alternative account of expectancy effects in terms of semiotic confusion? A vast body of evidence on learning, memory, and cognition converges in the conclusion that, if anything, unexpected, surprising, or script-inconsistent information is elaborated more deeply and recalled better than expected or schema-congruent information ( Cheng, 1997 ; Hastie, 1980 ; Rescorla & Wagner, 1972 ; Stangor & McMillan, 1992 ). Thus, the available evidence suggests that any processing bias (in terms of effective encoding or reduced information loss, as measured by i ) facilitates memory for unexpected rather than expected information.

One might conjecture that correlation judgments are driven not by memory for individuating stimulus information but by a top-down process that is mostly sensitive to prior knowledge. And, indeed, when the experimental task setting encourages guessing based on expectations rather than recall of individual stimuli, expected information may override unexpected data (see Garcia-Marques & Hamilton, 1996 ; Heit, 1993 ; Stangor & McMillan, 1992 ). In BIAS, this would correspond to a case in which initial stimulus processing is unbiased ( i constant) and an extra weight of expected information is introduced in the final integration process. However, then the unresolved question remains as to the origin of the expectations that influence final judgment or guessing. One possible answer is, again, semantic similarity.

Conversely, much can be said in favor of a cue-overlap account, in accordance with a feature approach to similarity ( Tversky, 1977 ). After all, the crucial independent variable in most experiments on expectancy-based illusory correlations is the semantic relatedness of attribute names ( L. J. Chapman & Chapman, 1967 ; Hamilton & Rose, 1980 ; Miller, 1971 ) rather than direct manipulations or measures of expectancies (i.e., subjective likelihood of anticipated stimulus events). A prominent task for future research is to differentiate between genuine expectancy effects and other aspects of similarity, such as cue overlap.

The notion of cue overlap opens a new semiotic perspective on cognition and social cognition. Just as two personality tests may exhibit an artificial correlation because a subset of items occurs in both tests, perception in a Brunswikian multiple-cue world may confound the meaning of distal concepts. Stereotypes linking masculinity and leadership, femininity and emotionality, or novelty and danger may arise from neither expectations nor motivational biases but may simply reflect semiotic confounding of the related concepts.

With respect to the other area of illusory correlations based on distinctiveness, the simulations reveal a similar need to distinguish between fundamentally different cases. Most important, the model highlights the need to conceptually and operationally distinguish between the infrequency of Cell D and the enhanced encoding or memory of Cell D information. An experimental analog of the present simulations would involve a design in which the infrequency of Cell D is manipulated independently of the salience of stimuli. For instance, the stimulus distribution might be equal ( a = b = c = d = 10) or skewed ( a = 20, b = 10, c = 10, d = 5), whereas an orthogonal manipulation might give extra attention to either Cell D or Cell C events. The very crossing of both manipulations in one design shows that both factors are conceptually different, as explicated in BIAS. Whereas one factor (enhanced salience) presupposes a biased process, the other factor (infrequency) produces illusory correlations through a purely unbiased aggregation process. To my knowledge, such experiments have rarely been conducted (but see Fiedler & Stroehm, 1986 ).
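A simulation analog of such a crossed design can be sketched as follows. The code is illustrative only and rests on several assumptions not taken from the original program: stimuli are noisy copies of ideal cue patterns (each cue inverted with probability i = .33), salience is modeled as a doubled weight for the items of one cell, and the dependent measure is the mean difference in judged leadership between healthy and unhealthy people. The ideal pattern and all names are hypothetical.

```python
import math
import random

L_POS = [1, 1, -1, 1, -1, -1, 1, -1, 1]      # hypothetical ideal l+
L_NEG = [-c for c in L_POS[:6]] + L_POS[6:]  # l-: inverted in six of nine cues


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def group_judgment(n_leaders, w_leaders, n_nonleaders, w_nonleaders, i, rng):
    """Weighted sum of one group's noisy stimulus vectors, correlated with l+."""
    total = [0.0] * len(L_POS)
    for ideal, n, w in ((L_POS, n_leaders, w_leaders),
                        (L_NEG, n_nonleaders, w_nonleaders)):
        for _ in range(n):
            for k, cue in enumerate(ideal):
                observed = -cue if rng.random() < i else cue  # assumed noise
                total[k] += w * observed
    return pearson(total, L_POS)


def illusion(freqs, weights, i=0.33, n_judges=2000, seed=0):
    """Mean judged leadership of healthy minus unhealthy people."""
    (a, b, c, d), (wa, wb, wc, wd) = freqs, weights
    rng = random.Random(seed)
    diff = 0.0
    for _ in range(n_judges):
        diff += (group_judgment(a, wa, b, wb, i, rng)
                 - group_judgment(c, wc, d, wd, i, rng))
    return diff / n_judges


# Factor 1: stimulus distribution; Factor 2: which cell receives extra weight.
DISTRIBUTIONS = {"equal": (10, 10, 10, 10), "skewed": (20, 10, 10, 5)}
SALIENCE = {"none": (1, 1, 1, 1), "cell_C": (1, 1, 2, 1), "cell_D": (1, 1, 1, 2)}

results = {
    (dist, sal): illusion(freqs, weights)
    for dist, freqs in DISTRIBUTIONS.items()
    for sal, weights in SALIENCE.items()
}
for condition, value in results.items():
    print(condition, round(value, 2))
```

Within each distribution, extra weight on Cell D pushes the difference upward and extra weight on Cell C pushes it downward, relative to the unweighted baseline. The sketch is meant to show only that the two factors can be varied independently; the magnitudes depend on the assumed parameters and are not intended to reproduce Table 5.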

Of course, the different sources of illusory correlations are not mutually exclusive. Although skewed frequency distributions alone can produce illusions ( Fiedler, 1991 ; Smith, 1991 ), this does not preclude that infrequent events can have a memory advantage, as suggested by Hamilton and colleagues ( Hamilton, Dugan, & Trolier, 1985 ; Hamilton & Sherman, 1989 ). However, as just mentioned, both sources have not been separated in the same design; experimenters have confined themselves to manipulating the infrequency of one cell (e.g., negative behavior in a minority). Thus, the crucial theoretical question is whether there is cogent evidence to support the mediational assumption that infrequency effects observed in these experiments are actually due to a memory advantage for the most infrequent observations.

Proponents of such an account have pointed to occasionally obtained evidence for enhanced recall ( Hamilton et al., 1985 ; Hamilton & Sherman, 1989 ; Mullen & Johnson, 1990 ) or prolonged encoding latencies of infrequent stimuli ( Johnson & Mullen, 1994 ). However, a critical inspection of this research shows that evidence for enhanced memory is hard to replicate ( Fiedler, Russer, & Gramm, 1993 ; Klauer & Meiser, 1998 ) and that the latency data suffer from a failure to control for speed-accuracy trade-offs. With regard to accuracy, memory for infrequent events is worst in that judgments of this category exhibit the strongest bias. Thus, alleged evidence for enhanced memory has been typically confused with overestimation or exaggeration effects. When more refined methods are used to analyze recall performance, such as signal-detection analysis ( Fiedler, Freytag, Walther, & Nickel, 1997 ; Fiedler et al., 1993 ) or multinomial modeling ( Klauer & Meiser, 1998 ), there is no support for enhanced memory of infrequent stimuli. However, a strong response bias is apparent in that judges associate the frequent level of one variable with the frequent level of the other variable (e.g., positive behavior with the majority).

Apart from the empirical evidence, the distinctiveness account of infrequency effects suffers from a serious problem, as neatly delineated by Hunt (1995) . Distinctiveness is used as a theoretical construct to explain the impact of infrequency on correlation judgments, indicating that infrequent observations are salient and prominent in memory. However, at the operational level, distinctiveness (the theoretical construct) is either equated with infrequency (the independent variable) or inferred from the resulting bias (the dependent variable). Operationally independent measures of distinctiveness are extremely rare. Thus, the same variables appear in the explanation as in the phenomena to be explained.

General Discussion

The present review has identified several strands of research on illusory correlation that have received considerable attention in cognitive and social psychology. The systematic distortion ( Shweder, 1982 ) of correlation assessments provides a challenging research topic with serious implications for many applied areas such as diagnostics, stereotyping, marketing, and decision making.

However, from a theoretical perspective, the review revealed that explanations of illusory correlations have not been embedded in a comprehensive model and important distinctions have gone unnoticed. Almost all accounts have referred to biased cognitive processes and selective weighting as sources of illusory correlations. But the precise cognitive algorithms have not been spelled out clearly, and potential alternatives have been neglected. In different paradigms, theoretical accounts have emphasized either a processing advantage of expectancy-congruent information or an advantage of expectancy-discrepant, distinctive information. Both assumptions arise from two basic gestalt principles, congruency and distinctiveness, the plausibility of which may have hindered deeper, more critical analyses.

Within the framework of a Brunswikian, probabilistic multiple-cue model, BIAS, a simple associative algorithm, was proposed; this model provides a comprehensive framework for various types of illusory correlations. Within this framework, the reviewed phenomena could be integrated. The model also helped to elucidate the underlying processes, to distinguish analytically between qualitatively different variants of the illusion, and to point out new variants and formulate open questions for future research.

Simulations highlighted the possibility that so-called expectancy-based illusory correlations may, to an unknown degree, reflect an influence of similarity that is independent of expectations. An alternative, semiotic account suggests that cue overlap alone can cause illusory correlations between semantically similar attributes. It was also clarified that genuine expectancy effects can reflect different processes (cf. Hamilton, 1981 ) such as selective information search, biased encoding, forgetting, or differential judgment weights.

Within the other major domain, distinctiveness-based illusory correlations, the model also served to distinguish several variants of genuine distinctiveness effects. Illusory correlations may be due to the enhanced salience of selected stimuli during encoding, to enhanced memory for particular event combinations, or to an extra weight in the final judgment stage. However, without any such bias, an illusion can also arise from skewed frequency distributions alone. Because prior experimental research has failed to isolate these separate sources, many findings remain equivocal. Experimental research is strongly needed to manipulate the different types of distinctiveness in an orthogonal fashion. Skewed frequencies alone produce the illusion, without biased processing; thus, an unequivocal distinctiveness effect requires a stronger illusion than the pure infrequency bias.

An explicit and transparent algorithm such as BIAS not only helps to clarify theoretical issues related to illusory correlations; it may also facilitate recognition of other paradigms, usually treated under different labels, as hidden cases of illusory correlations. Four such related paradigms are addressed briefly, and the BIAS framework is used to widen the scope of illusory correlation research.

Intergroup Discrimination

As has been shown, BIAS correctly predicts that the same prevalence of desirable over undesirable behaviors is more readily detected in a large (majority) than in a small (minority) sample or group ( Fiedler, 1991 ; Hamilton & Sherman, 1989 ). However, such a constellation is by no means confined to minority issues. Regardless of the actual group size, people may have more information on one group than another (as a result of familiarity, proximity, etc.). Because more observations are usually available on one's in-group than one's out-group, in the environment as well as in memory, the (normative) predominance of positive, desirable behavior should be more apparent for the in-group.

This sort of an illusory correlation is supported by hundreds of intergroup studies showing a relative in-group-serving bias ( Messick & Mackie, 1989 ; Tajfel, 1982 ). Moreover, if observations refer to different behavioral trait dimensions, BIAS predicts that more different traits should be detected from the larger sample of the in-group, leading to a more differentiated and less homogeneous impression of the in-group than the out-group ( Judd & Park, 1988 ; Linville, Fischer, & Salovey, 1989 ). Thus, when applied to the illusory correlation between group membership (in-group vs. out-group) and valence (positive vs. negative), BIAS explains two major intergroup phenomena at the same time, the in-group-serving bias and the out-group homogeneity effect ( Fiedler, Kemmelmeier, & Freytag, 1999 ).

As an aside, a curious conflict is created when the asymmetry of in-groups and out-groups is reframed as an illusory correlation. The out-group homogeneity effect has been explained by the assumption that the smaller samples of out-group information are represented as abstract prototypes, whereas the richer samples of in-group information are represented to a much greater extent by distinctive, detailed information about individual cases ( Judd & Park, 1988 ; Park & Hastie, 1987 ). This is in sharp contrast to the basic assumption in the illusory correlation paradigm that small groups (infrequent event categories) are more distinctive than large groups ( Hamilton & Sherman, 1989 ). The BIAS model clarifies that neither assumption is strictly necessary. Favorable and differentiated judgments of large groups can be simulated without different processing or representation assumptions.

Illusory Hypothesis Verification

Unequal sample sizes, as a source of illusory correlations, may be self-generated rather than provided by the experimenter or the environment. People may, for several reasons, think about, discuss, and search more information on one category than another. For instance, in an election campaign, voters may expose themselves more to arguments of their own party than to arguments of an opponent party, even when the argument pools are equally large (or infinite). Similarly, unequal samples may result from lopsided discussions or memory search, producing the same kind of illusion as between in-groups and out-groups or majorities and minorities.

The crucial role of sample size, or aggregation, speaks to the well-known verification tendency in hypothesis testing, commonly referred to as confirmation bias ( Snyder, 1984 ) or self-fulfilling prophecies ( Jussim, 1986 ; Kukla, 1994 ). When trying to determine whether their interview partner is an extravert, people form a more extraverted impression of the partner than when the question focuses on the partner's introversion ( Snyder & Swann, 1978 ; Swann, Giuliano, & Wegner, 1982 ; Zuckerman, Knee, Hodgins, & Miyake, 1995 ). These and countless similar findings can be reframed as infrequency-based illusory correlations, for two reasons. First, hypothesis testers typically engage in positive testing ( Klayman & Ha, 1987 ); that is, they gather larger samples on the focused category (e.g., extraverted behaviors) than the unfocused category (introverted behaviors). Second, confirming answers are generally more likely in social communication than negative, disconfirming answers ( Zuckerman et al., 1995 ). Both tendencies together will produce the kind of skewed distribution that BIAS has shown to produce illusory correlations in the absence of any further bias. Experimental support for this was recently presented by Fiedler, Walther, and Nickel (1999) .

Note also that the BIAS algorithm predicts that another variant of illusory correlations may contribute to the verification bias. To the extent that confirming data (e.g., actually observed behaviors) are more diagnostic than disconfirming data (e.g., omitted, unobserved behaviors), judgments should be mainly determined by confirming evidence for the focused category. As shown in the simulations, enhanced diagnosticity alone can create illusory correlations, or verifications ( Trope & Bassok, 1983 ; Trope & Thompson, 1997 ).

Learning and Conditioning

The affinity of the BIAS model to learning processes is immediately evident. On the basis of Garcia and Koelling's (1966) pioneering work, Seligman (1970) introduced the notion of preparedness into the field of classical conditioning. Organisms are prepared or predisposed to learn the association of an unconditional stimulus (e.g., sickness) to certain conditional stimuli (olfactory sensations) more readily than to others (electrical shock). This phenomenon is often treated as just another manifestation of an expectancy effect. Over a long evolutionary period, an organism seems to have learned that olfactory sensations most likely predict sickness. Such an interpretation is facilitated by the proximity to other conditioning experiments using more meaningful stimuli, showing, for instance, that frowning faces are more easily associated with an aversive unconditioned stimulus than friendly faces ( Orr & Lanzetta, 1980 ). According to the present approach, this may not be justified. Whereas the readiness to associate aversive shock with frowning (as opposed to smiling) faces may indeed reflect an expectancy effect, the preparedness to associate sickness and olfactory sensations may be due to confounded, overlapping cues in the area of odor and taste.

At the same time, there is ample evidence for an important role of distinctiveness in conditioning, especially with reference to the leading model of Rescorla and Wagner (1972) . Unexpected conditioned stimulus-unconditioned stimulus pairings that are not predicted from previous stimulus pairings (i.e., distinctive events) cause more learning progress than nonsurprising, anticipated events, as evident in the blocking effect ( Kamin, 1968 ; Sanbonmatsu, Akimoto, & Gibson, 1994 ).

Bayesian Inference

Countless experiments have been published on the so-called base-rate neglect in probabilistic inference. In a typical task ( Gigerenzer & Hoffrage, 1995 ), judges are asked to estimate the probability of breast cancer given a positive mammography based on statistical data on the contingency between these two variables. Judges normally exaggerate the conditional probability of breast cancer given a positive mammography, as if they were ignoring the low base rate of the criterion event. The same bias can be obtained with sequential observations, just as in illusory correlation experiments ( Fiedler, Brinkmann, Betsch, & Wild, in press ).
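The normative answer in such a task follows from Bayes' theorem and can be computed directly. The numbers in the sketch below (a base rate of 1%, a hit rate of 80%, and a false-alarm rate of 9.6%) are illustrative values of the kind typically used in versions of the mammography problem; they are not reported in the present text and are assumptions made for this example.

```python
def posterior(base_rate, hit_rate, false_alarm_rate):
    """P(disease | positive test) according to Bayes' theorem."""
    true_positives = base_rate * hit_rate
    false_positives = (1 - base_rate) * false_alarm_rate
    return true_positives / (true_positives + false_positives)


# Illustrative (assumed) values for the mammography problem.
p_cancer_given_positive = posterior(base_rate=0.01,
                                    hit_rate=0.80,
                                    false_alarm_rate=0.096)
print(round(p_cancer_given_positive, 3))
```

With these values the posterior probability is below 10%, far lower than the hit rate of 80%. Judgments close to the hit rate therefore illustrate the exaggeration that is commonly described as base-rate neglect.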

Again, the common explanation of base-rate neglect, in terms of the representativeness heuristic ( Kahneman & Tversky, 1972 ), highlights the similarity in meaning of breast cancer and positive mammography, providing another case for the congruence rule. Within the BIAS approach, viable alternatives are immediately apparent (see Fiedler et al., 1999 ). The same sort of overestimation of the co-occurrence of breast cancer and positive mammography may originate in the enhanced diagnosticity of positive information (concentrating on women with breast cancer and ignoring women without breast cancer) or an infrequency-based illusory correlation (yielding an overestimation of rare events, such as breast cancer).

What Are the Cognitive Properties of BIAS?

Crucial to understanding the "cognitive properties" of the BIAS algorithm is that it is sensitive to event frequencies as well as similarities, that is, to extensional (statistical) as well as intensional (meaning-related) information ( Fiedler & Stroehm, 1986 ). BIAS integrates both sources of information within the same algorithm. The statistical relation between x and y is represented by the number of stimulus items making up the bivariate distribution. The semantic similarity of x and y enters as the cue overlap of the respective segments used to represent x and y . Thus, whereas the matrix columns (see Figure 3 ) reflect the statistical stimulus distribution, the cue composition across rows contains the intensional (similarity) information. Both influences, event frequencies and similarity, can in principle compensate for each other. When the statistical distribution (of columns) does not support a correlation, a judgment prompt that activates many overlap cues will distort the resulting judgment in a way that exaggerates the similar, overlapping attributes. In this way, BIAS helps to bridge the categorical gap between two sources of seemingly incomparable information ( Shweder, 1977b ).

Closely related to the two sources of frequency-related and similarity-related information are the two major "cognitive properties" of BIAS that are responsible for its ability to simulate so many empirical phenomena (cf. Fiedler, 1996 ; Fiedler et al., in press ). These two key properties are the differential aggregation resulting from unequal statistical samples and the built-in similarity function resulting from overlapping cues. BIAS shares these properties with other connectionist approaches involving distributed representations of noisy data ( Kashima & Kerekes, 1994 ; Kruschke, 1992 ; McClelland & Rumelhart, 1985 ; Smith, 1991 , 1996 ). It is no wonder that these alternative models would also allow for simulations of illusory correlations. BIAS was only chosen as the simplest approach with a minimum of assumptions and parameters. What accounts for illusory correlations is not specific parameters or functions but the basic qualitative properties of aggregation and semiotic cue overlap.

Conclusion

The intended message of the present article, if it was conveyed successfully, can be summarized as follows. Illusory correlations have attracted considerable research interest in many fundamental as well as applied domains. However, their popular theoretical explanations in terms of two basic gestalt rules, congruency and distinctiveness, remained incomplete and did not reach the level of a clearly spelled out algorithm. A review of the empirical literature revealed three major research areas that developed in relative isolation, with little cross-referencing: illusory correlations based on expectancies, those based on the asymmetry of positive and negative attributes, and those based on stimulus infrequencies. Different theory heuristics have guided the research in these areas, pointing to separate aspects of the cognitive process. However, no comprehensive theoretical framework has been developed within which these different aspects of the cognitive process can be located and pitted against each other.

As a step toward such a framework, a simple distributive learning model was presented as a transparent computer algorithm. This model can account for virtually all qualitative variants of illusory correlations, with very few assumptions. In addition to its simplicity and explanatory value, the model gives rise to alternative explanations of old phenomena and original predictions of neglected types of illusory correlations. Moreover, the model helps to integrate research from other paradigms that are not commonly recognized as variants of illusory correlations.

However, perhaps the most important theoretical insight gained from the present approach is an understanding of how and why it is possible that biased correlation assessments need not originate in biased processes. Although it is hardly surprising that biased cognitive processes (due to expectancies, distinctiveness, or salience) can lead to biased judgments, a more intriguing theoretical issue is how aggregation effects and cue overlap can cause biased outcomes in a completely unbiased information-processing device.
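The role of cue overlap in an unbiased device can be sketched in the same spirit. This is again an illustration under stated assumptions rather than the BIAS code: the attribute patterns X and Y, the number of shared cues, the number of observations, and the flip probability are hypothetical. The learner only ever observes noisy instances of attribute X and sums them without any weighting. The aggregate is then compared with the ideal pattern of attribute Y, which was never presented.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def mean_match(overlap, n_runs=2000, n_cues=20, n_obs=12, flip_prob=0.3, seed=7):
    """Average match between memory for X and the never-observed pattern Y.

    `overlap` is the number of cues on which Y is identical to X; the
    remaining cues of Y are drawn independently.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        x = [rng.choice((-1, 1)) for _ in range(n_cues)]
        y = x[:overlap] + [rng.choice((-1, 1)) for _ in range(n_cues - overlap)]
        memory = [0] * n_cues
        for _ in range(n_obs):
            # Unbiased encoding: every cue of every observation has the same
            # chance of being inverted, and all observations are simply added.
            for i, v in enumerate(x):
                memory[i] += -v if rng.random() < flip_prob else v
        total += pearson(memory, y)
    return total / n_runs


if __name__ == "__main__":
    print(f"half of the cues shared: {mean_match(overlap=10):.2f}")
    print(f"no cues shared:          {mean_match(overlap=0):.2f}")
```

With shared cues, the aggregate should resemble Y well above chance even though Y never occurred; with no shared cues, the average match should stay near zero. The apparent association comes from the similarity structure of the representation, not from selective processing.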

References

Acorn, D. A., Hamilton, D. L. & Sherman, S. J. (1988). Generalization of biased perceptions of groups based on illusory correlations. Social Cognition, 6, 345-372.

Allan, L. G. (1993). Human contingency judgments: Rule based or associative? Psychological Bulletin, 114, 435-448.

Allan, L. G. & Jenkins, H. M. (1983). The effect of representations of binary variables on judgment of influence. Learning and Motivation, 14, 381-405.

Alloy, L. B. & Abramson, L. Y. (1979). Judgment of contingency in depressed and nondepressed students: Sadder but wiser? Journal of Experimental Psychology: General, 108, 441-485.

Alloy, L. B. & Tabachnik, N. (1984). Assessment of covariation by humans and animals: The joint influence of prior expectations and current situational information. Psychological Review, 91, 112-149.

Anderson, C. A. (1995). Implicit personality theories and empirical data: Biased assimilation, belief perseverance and change, and covariation detection sensitivity. Social Cognition, 13, 25-48.

Anderson, C. A. & Kellam, K. L. (1992). Belief perseverance, biased assimilation, and covariation assessment. Personality and Social Psychology Bulletin, 18, 555-565.

Anderson, J. & Sheu, C. F. (1995). Causal inferences as perceptual judgments. Memory & Cognition, 23, 510-524.

Arcuri, L. & Forzi, M. (1988). Prior expectations and manner of data presentation as factors influencing covariation estimates for incomplete contingency tables. Acta Psychologica, 67, 121-133.

Arkes, H. R. & Harkness, A. R. (1983). Estimates of contingency between two dichotomous variables. Journal of Experimental Psychology: General, 112, 117-135.

Baker, A. G., Berbrier, M. W. & Vallee-Tourangeau, F. (1989). Judgments of a 2 × 2 contingency table: Sequential processing and the learning curve. Quarterly Journal of Experimental Psychology, 41B, 65-97.
Baker, D. P. & Jones, D. P. (1993). Creating gender equality: Cross-national gender stratification and mathematical performance. Sociology of Education, 66, 91-103.

Basow, S. A., Smither, J. W., Rupert, L. & Collins, H. (1989). The effect of satisfaction and gender on self-evaluations of task performance. Sex Roles, 20, 413-427.

Baumgartner, H. (1995). On the utility of consumers' theories in judgments of covariation. Journal of Consumer Research, 21, 634-643.

Benedict, J. O. (1991). Judgment of covariation in classical and instrumental conditioning contexts. Bulletin of the Psychonomic Society, 29, 457-459.

Berman, J. S. & Kenny, D. A. (1976). Correlational bias in observer ratings. Journal of Personality and Social Psychology, 34, 263-273.

Berndsen, M., Spears, R. & van der Pligt, J. (1996). Illusory correlation and attitude-based vested interest. European Journal of Social Psychology, 26, 247-264.

Bettman, J. R., John, D. R. & Scott, C. A. (1986). Covariation assessment by consumers. Journal of Consumer Research, 13, 316-326.

Billman, D., Bornstein, B. & Richards, J. (1992). Effects of expectancy on assessing covariation in data: "Prior belief" versus "meaning." Organizational Behavior and Human Decision Processes, 53, 74-88.

Brewer, M. B. (1979). In-group bias in the minimal intergroup situation: A cognitive-motivational analysis. Psychological Bulletin, 86, 307-324.

Brunswik, E. (1955). Representative design and probabilistic theory in a functional psychology. Psychological Review, 62, 193-217.

Brunswik, E. (1956). Perception and the representative design of experiments. Berkeley: University of California Press.
Busemeyer, J. R. (1991). Intuitive statistical estimation. In N. H. Anderson (Ed.), Contributions to information integration theory (Vol. 1, pp. 187-215). Hillsdale, NJ: Erlbaum.
Cacioppo, J. T., Gardner, W. L. & Berntson, G. G. (1997). Beyond bipolar conceptualizations and measures: The case of attitude and evaluative space. Personality and Social Psychology Review, 1, 3-25.
Camerer, C. (1988). Illusory correlations in perceptions and predictions of organizational traits. Journal of Behavioral Decision Making, 1, 77-94.

Casas, J. M., Brady, S. & Ponterotto, J. G. (1983). Sexual preference biases in counseling: An information processing approach. Journal of Counseling Psychology, 30, 139-145.

Chapman, G. B. & Robins, S. J. (1990). Cue interaction in human contingency judgment. Memory & Cognition, 18, 537-545.

Chapman, L. J. (1967). Illusory correlation in observational report. Journal of Verbal Learning and Verbal Behavior, 6, 151-155.

Chapman, L. J. & Chapman, J. P. (1967). Genesis of popular but erroneous psychodiagnostic observations. Journal of Abnormal Psychology, 72, 193-204.

Chapman, L. J. & Chapman, J. P. (1969). Illusory correlations as an obstacle to the use of valid psychodiagnostic signs. Journal of Abnormal Psychology, 74, 271-280.

Chatlosh, D. L., Neunaber, D. J. & Wasserman, E. A. (1985). Response-outcome contingency: Behavioral and judgmental effects of appetitive and aversive outcomes with college students. Learning and Motivation, 16, 1-34.
Cheng, P. W. (1997). From covariation to causation: A causal power model. Psychological Review, 104, 367-405.

Cheng, P. W. & Novick, L. R. (1990). A probabilistic contrast model of causal induction. Journal of Personality and Social Psychology, 58, 545-567.

Crocker, J. (1981). Judgment of covariation by social perceivers. Psychological Bulletin, 90, 272-292.

Crocker, J. (1982). Biased questions in judgment of covariation studies. Personality and Social Psychology Bulletin, 8, 214-220.

D'Andrade, R. L. (1974). Memory and the assessment of behavior. In H. M. Blalock, Jr. (Ed.), Measurement in the social sciences (pp. 159-186). Chicago: Aldine.
Dawes, R. M. (1989). Experience and validity of clinical judgment: The illusory correlation. Behavioral Sciences and the Law, 7, 457-467.

de Jong, P. J. & Merckelbach, H. (1993). Covariation bias, classical conditioning, and phobic fear. Integrative Physiological and Behavioral Science, 28, 167-170.

de Jong, P. J., Merckelbach, H. & Arntz, A. (1990). Illusory correlation, on-line probability estimates, and electrodermal responding in a (quasi)-conditioning paradigm. Biological Psychology, 31, 201-212.

de Jong, P. J., Merckelbach, H. & Arntz, A. (1995). Covariation bias in phobic women: The relationship between a priori expectancy, on-line expectancy, autonomic responding, and a posteriori contingency judgment. Journal of Abnormal Psychology, 104, 55-62.

de la Haye, A. M. & Lauvergeon, G. (1991). Processus des mémoire dans la formation des corrélations illusoires [Memory processes in the formation of illusory correlations]. Psychologie Française, 36, 67-77.

de Soto, C. B. (1960). Learning a social structure. Journal of Abnormal and Social Psychology, 60, 417-421.

DiBattista, D. & Shepherd, M. L. (1993). Primary school teachers' beliefs and advice to parents concerning sugar consumption and activity in children. Psychological Reports, 72, 47-55.

Dickinson, A., Shanks, D. & Evenden, J. (1984). Judgment of act-outcome contingency: The role of selective attribution. Quarterly Journal of Experimental Psychology, 36A, 29-50.
Dowling, J. F. & Graham, J. R. (1976). Illusory correlation and the MMPI. Journal of Personality Assessment, 40, 531-538.

Einhorn, H. J. & Hogarth, R. M. (1978). Confidence in judgment: Persistence in the illusion of validity. Psychological Review, 85, 395-416.

Eysenck, H. J. (1956). Wege und Abwege der Psychologie [Ways and wrong ways of psychology]. Hamburg, Germany: Rowohlt.
Feldman, J. M., Camburn, A. & Gatti, G. M. (1986). Shared distinctiveness as a source of illusory correlation in performance appraisal. Organizational Behavior and Human Decision Processes, 37, 34-59.

Festinger, L. (1957). A theory of cognitive dissonance. Evanston, IL: Row, Peterson.
Fiedler, K. (1985). Kognitive Strukturierung der sozialen Umwelt [Cognitive structuring of the social environment]. Göttingen, Germany: Hogrefe.
Fiedler, K. (1991). The tricky nature of skewed frequency tables: An information loss account of distinctiveness-based illusory correlations. Journal of Personality and Social Psychology, 60, 24-36.

Fiedler, K. (1996). Explaining and simulating judgment biases as an aggregation phenomenon in probabilistic, multiple-cue environments. Psychological Review, 103, 193-214.

Fiedler, K. & Armbruster, T. (1994). Two halfs may be more than one whole: Category-split effects on frequency illusions. Journal of Personality and Social Psychology, 66, 633-645.

Fiedler, K., Brinkmann, J., Betsch, T. & Wild, B. (in press). A sampling approach to biases in conditional probability judgments: Beyond baserate neglect and statistical format. Journal of Experimental Psychology: General.
Fiedler, K., Freytag, P., Walther, E. & Nickel, S. (1997). Illusory correlations, stereotyping and active information search. Manuscript submitted for publication.
Fiedler, K., Hemmeter, U. & Hofmann, C. (1984). On the origin of illusory correlations. European Journal of Social Psychology, 14, 191-201.

Fiedler, K., Kemmelmeier, M. & Freytag, P. (1999). Explaining asymmetric intergroup judgments in terms of differential aggregation: Computer simulations of old and new evidence. European Review of Social Psychology.
Fiedler, K., Russer, S. & Gramm, K. (1993). Illusory correlations and memory performance. Journal of Experimental Social Psychology, 29, 111-136.

Fiedler, K. & Stroehm, W. (1986). The use of statistical, spatial-temporal, and intensional information in judgments of contingency. European Journal of Social Psychology, 16, 385-398.
Fiedler, K., Walther, E. & Nickel, S. (1999). The auto-verification of social hypotheses: Stereotyping and the power of sample size. Journal of Personality and Social Psychology, 77, 5-18.
Garcia, J. & Koelling, R. A. (1966). Relation of cue to consequence in avoidance learning. Psychonomic Science, 4, 123-124.

Garcia-Marques, L. & Hamilton, D. L. (1996). Resolving the apparent discrepancy between the incongruency effect and the expectancy-based illusory correlation effect: The TRAP model. Journal of Personality and Social Psychology, 71, 845-860.
Garner, W. R. (1978). Aspects of a stimulus: Features, dimensions and configurations. In E. Rosch & B. B. Lloyd (Eds.), Cognition and categorization (pp. 99-133). Hillsdale, NJ: Erlbaum.
Gigerenzer, G. & Hoffrage, U. (1995). How to improve Bayesian reasoning without instructions: Frequency formats. Psychological Review, 102, 684-704.

Hamilton, D. L. (1981). Illusory correlations as a basis for stereotyping. In D. L. Hamilton (Ed.), Cognitive processes in stereotyping and intergroup behavior (pp. 115-144). Hillsdale, NJ: Erlbaum.
Hamilton, D. L., Dugan, P. M. & Trolier, T. K. (1985). The formation of stereotypic beliefs: Further evidence for distinctiveness-based illusory correlations. Journal of Personality and Social Psychology, 48, 5-17.
Hamilton, D. L. & Gifford, R. K. (1976). Illusory correlation in interpersonal perception: A cognitive basis of stereotypic judgments. Journal of Experimental Social Psychology, 12, 392-407.

Hamilton, D. L. & Rose, R. L. (1980). Illusory correlation and the maintenance of stereotypic beliefs. Journal of Personality and Social Psychology, 39, 832-845.

Hamilton, D. L. & Sherman, S. J. (1989). Illusory correlations: Implications for stereotype theory and research. In D. Bar-Tal, C. F. Graumann, A. W. Kruglanski, & W. Stroebe (Eds.), Stereotype and prejudice: Changing conceptions (pp. 59-82). New York: Springer.
Harkness, A. R., DeBono, K. G. & Borgida, E. (1985). Personal involvement and strategies for making contingency judgments: A stake in the dating game makes a difference. Journal of Personality and Social Psychology, 49, 22-32.

Haslam, S. A. & McGarty, C. (1994). Problems with measurement of illusory correlation. European Journal of Social Psychology, 24, 611-621.

Haslam, S. A., McGarty, C. & Brown, P. M. (1996). The search for differentiated meaning is a precursor to illusory correlation. Personality and Social Psychology Bulletin, 22, 611-619.

Hastie, R. (1980). Memory for information that confirms or contradicts a general impression. In R. Hastie, T. M. Ostrom, E. B. Ebbesen, R. S. Wyer, D. L. Hamilton, & D. E. Carlston (Eds.), Person memory: The cognitive basis of social perception (pp. 155-177). Hillsdale, NJ: Erlbaum.
Heider, F. (1958). The psychology of interpersonal relations. New York: Wiley.
Heit, E. (1993). Modeling the effects of expectations on recognition memory. Psychological Science, 4, 244-252.

Hintzman, D. L. (1988). Judgments of frequency and recognition memory in a multiple-trace memory model. Psychological Review, 95, 528-551.

Hoffman, C. & Hurst, N. (1990). Gender stereotypes or rationalization? Journal of Personality and Social Psychology, 58, 197-208.

Hoffman, P. J. (1960). The paramorphic representation of clinical judgment. Psychological Bulletin, 57, 116-131.

Hunt, R. R. (1995). The subtlety of distinctiveness: What von Restorff really did. Psychonomic Bulletin & Review, 2, 105-112.

Inhelder, B. & Piaget, J. (1958). The growth of logical thinking from childhood to adolescence. New York: Basic Books.
Jenkins, H. M. & Sainsbury, R. S. (1970). Discrimination learning with the distinctive feature on positive or negative trials. In D. Mostofski (Ed.), Attention: Contemporary theory and analysis (pp. 239-273). New York: Appleton-Century-Crofts.
Jenkins, H. M. & Ward, W. C. (1965). Judgment of contingency between responses and outcomes. Psychological Monographs, 79(1, Whole No. 594).
John, D. R., Scott, C. A. & Bettman, J. R. (1986). Sampling data for covariation assessment: The effect of prior beliefs on search patterns. Journal of Consumer Research, 13, 38-47.
Johnson, C. & Mullen, B. (1993). The determinants of differential group evaluations in distinctiveness-based illusory correlations in stereotyping. British Journal of Social Psychology, 32, 253-263.

Johnson, C. & Mullen, B. (1994). Evidence for the accessibility of paired distinctiveness in distinctiveness-based illusory correlation in stereotyping. Personality and Social Psychology Bulletin, 20, 65-70.

Jones, R. A., Scott, J., Solernou, J., Noble, A., Fiala, J. & Miller, K. (1977). Availability and formation of stereotypes. Perceptual and Motor Skills, 44, 631-638.

Judd, C. M. & Park, B. (1988). Outgroup-homogeneity: Judgments of variability at the individual and the group levels. Journal of Personality and Social Psychology, 54, 778-788.

Jussim, L. (1986). Self-fulfilling prophecies: A theoretical and integrative review. Psychological Review, 93, 429-445.

Kahneman, D. & Tversky, A. (1972). A judgment of representativeness. Cognitive Psychology, 3, 430-454.

Kamin, L. J. (1968). "Attention-like" processes in classical conditioning. In M. R. Jones (Ed.), Miami Symposium on the Prediction of Behavior 1967: Aversive stimulation (pp. 9-31). Coral Gables, FL: University of Miami Press.
Kammann, R. & Campbell, K. (1982). Illusory correlation in popular beliefs about the causes of happiness. New Zealand Psychologist, 11, 52-63.

Kao, S.-F. & Wasserman, E. A. (1993). Assessment of an information integration account of contingency judgment with examination of subjective cell importance and method of information presentation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 1363-1386.

Kashima, Y. & Kerekes, A. R. Z. (1994). A distributed memory model of averaging phenomena in person impression formation. Journal of Experimental Social Psychology, 30, 407-455.

Kim, H. S. & Baron, R. S. (1988). Exercise and the illusory correlation: Does arousal heighten stereotypic processing? Journal of Experimental Social Psychology, 24, 366-380.

Klauer, K. C. & Meiser, T. (1998). A source-monitoring analysis of illusory correlations. Manuscript submitted for publication.
Klayman, J. & Ha, Y.-W. (1987). Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review, 94, 211-228.

Kozlowski, S. W. & Kirsch, M. P. (1987). The systematic distortion hypothesis, halo, and accuracy: An individual-level analysis. Journal of Applied Psychology, 72, 252-261.

Kruschke, J. K. (1992). ALCOVE. Psychological Review, 99, 22-44.

Kukla, A. (1994). The structure of self-fulfilling and self-negating prophecies. Theory and Psychology, 4, 5-33.

Kunda, Z. & Thagard, P. (1996). Forming impressions from stereotypes, traits, and behaviors: A parallel-constraint-satisfaction theory. Psychological Review, 103, 284-308.

Kurtz, R. M. & Garfield, S. L. (1978). Illusory correlation: A further exploration of Chapman's paradigm. Journal of Consulting and Clinical Psychology, 46, 1009-1015.

Lang, R. S., Raulin, M. L. & Chapman, L. J. (1978). Illusory correlation and associative strength of words. Canadian Journal of Behavioral Science, 10, 264-266.
Linville, P. W., Fischer, G. W. & Salovey, P. (1989). Perceived distributions of the characteristics of in-group and out-group members: Empirical evidence and a computer simulation. Journal of Personality and Social Psychology, 57, 165-188.

Lueger, R. J. & Petzel, T. P. (1979). Illusory correlation in clinical judgment: Effects of amount of information to be processed. Journal of Consulting and Clinical Psychology, 47, 1120-1121.

Mackie, D. M., Hamilton, D. L., Schroth, H. A., Carlisle, C. J., Gersho, B. F., Meneses, L. M., Nedler, B. F. & Reichel, L. D. (1989). The effects of induced mood on expectancy-based illusory correlations. Journal of Experimental Social Psychology, 25, 524-544.

Malmi, R. A. (1986). Intuitive covariation estimation. Memory & Cognition, 14, 501-508.

McArthur, L. Z. (1980). Illusory causation and illusory correlation: Two epistemological accounts. Personality and Social Psychology Bulletin, 6, 507-519.

McArthur, L. & Friedman, S. A. (1980). Illusory correlation in impression formation: Variations in the shared distinctiveness effect as a function of the distinctive person's age, race, and sex. Journal of Personality and Social Psychology, 39, 615-624.

McCauley, C. & Stitt, C. L. (1978). An individual and quantitative measure of stereotypes. Journal of Personality and Social Psychology, 36, 929-940.

McClelland, J. L. & Rumelhart, D. E. (1985). Distributed memory and the representation of general and specific information. Journal of Experimental Psychology: General, 114, 159-188.

McConnell, A. R., Leibold, J. M. & Sherman, S. J. (1997). Within-target illusory correlations and the formation of context-dependent attitudes. Journal of Personality and Social Psychology, 73, 675-686.

McConnell, A. R., Sherman, S. J. & Hamilton, D. L. (1994a). Illusory correlation in the perception of groups: An extension of the distinctiveness-based account. Journal of Personality and Social Psychology, 67, 414-429.

McConnell, A. R., Sherman, S. J. & Hamilton, D. L. (1994b). On-line and memory-based aspects of individual and group target judgments. Journal of Personality and Social Psychology, 67, 173-185.

McGarty, C., Haslam, S. A., Turner, J. C. & Oakes, P. J. (1993). Illusory correlation as accentuation of actual intercategory difference: Evidence for the effect with minimal stimulus information. European Journal of Social Psychology, 23, 391-410.

McKenzie, C. R. M. (1994). The accuracy of intuitive judgment strategies: Covariation assessment and Bayesian inference. Cognitive Psychology, 26, 209-239.

McNally, R. J. & Heatherton, T. F. (1993). Are covariation biases attributable to a priori expectancy biases? Behaviour Research and Therapy, 31, 653-658.

Meehan, A. M. & Janik, L. M. (1990). Illusory correlation and the maintenance of sex role stereotypes in children. Sex Roles, 22, 83-95.

Messick, D. M. & Mackie, D. M. (1989). Intergroup relations. Annual Review of Psychology, 40, 45-81.

Miller, P. M. (1971). Do labels mislead? A multiple-cue study within the framework of Brunswik's probabilistic functionalism. Organizational Behavior and Human Performance, 6, 480-500.
Mullen, B. & Johnson, C. (1990). Distinctiveness-based illusory correlations and stereotyping: A meta-analytic integration. British Journal of Social Psychology, 29, 11-28.

Mullen, B. & Johnson, C. (1995). Cognitive representation in ethnophaulisms and illusory correlation in stereotyping. Personality and Social Psychology Bulletin, 21, 420-433.

Murphy, R. (1994). The effects of task characteristics on covariation assessment: The impact of accountability and judgment frame. Organizational Behavior and Human Decision Processes, 60, 139-155.

Mutter, S. A. & Poliske, R. M. (1994). Aging and illusory correlation in judgments of co-occurrence. Psychology and Aging, 9, 53-63.

Newman, J., Wolff, W. T. & Hearst, E. (1980). The feature-positive effect in adult human subjects. Journal of Experimental Psychology: Human Learning and Memory, 6, 630-650.

Nisbett, R. & Ross, L. (1980). Human inference: Strategies and shortcomings of social judgments. Englewood Cliffs, NJ: Prentice Hall.
Oddou, G. & Mendenhall, M. (1984). Person perception in cross-cultural settings: A review of cross-cultural and related cognitive literature. International Journal of Intercultural Relations, 8, 77-96.

Orr, S. P. & Lanzetta, J. T. (1980). Facial expressions of emotion as conditioned stimuli for human autonomic responses. Journal of Personality and Social Psychology, 38, 278-282.

Papini, M. R. & Bitterman, M. E. (1990). The role of contingency in classical conditioning. Psychological Review, 97, 396-403.

Park, B. & Hastie, R. (1987). Perception of variability in category development: Instance- versus abstraction-based stereotypes. Journal of Personality and Social Psychology, 53, 621-635.

Pryor, J. B. (1986). The influence of different encoding sets upon the formation of illusory correlations and group impressions. Personality and Social Psychology Bulletin, 12, 216-226.

Pryor, J. B. & Stoller, L. M. (1994). Sexual cognition processes in men high in the likelihood to sexually harass. Personality and Social Psychology Bulletin, 20, 163-169.

Pyszczynski, T. & Greenberg, J. (1987). Towards an integration of cognitive and motivational perspectives on social inference: A biased hypothesis-testing model. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 20, pp. 297-340). New York: Academic Press.
Read, S. J., Vanman, E. J. & Miller, L. (1997). Connectionism, parallel constraint satisfaction processes, and gestalt principles: (Re)introducing cognitive dynamics to social psychology. Personality and Social Psychology Review, 1, 26-53.

Rescorla, R. A. & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64-99). New York: Appleton-Century-Crofts.
Rosen, G. M. (1975). On the persistence of illusory correlations associated with the Rorschach. Journal of Abnormal Psychology, 84, 571-573.

Rothbart, M. & John, O. P. (1985). Social categorization and behavioral episodes: A cognitive analysis of the effects of intergroup contact. Journal of Social Issues, 41, 81-104.

Sanbonmatsu, D. M., Akimoto, S. A. & Gibson, B. D. (1994). Stereotype-based blocking in social explanation. Personality and Social Psychology Bulletin, 20, 71-81.

Sanbonmatsu, D. M., Shavitt, S. & Sherman, S. J. (1991). The role of personal relevance in the formation of distinctiveness-based illusory correlations. Personality and Social Psychology Bulletin, 17, 124-132.

Sanbonmatsu, D. M., Shavitt, S., Sherman, S. J. & Roskos-Ewoldsen, D. R. (1987). Illusory correlation in the perception of performance by self or a salient other. Journal of Experimental Social Psychology, 23, 518-543.

Sanbonmatsu, D. M., Sherman, S. J. & Hamilton, D. L. (1987). Illusory correlation in the perception of individuals and groups. Social Cognition, 5, 1-25.

Schaller, M. (1991). Social categorization and the formation of group stereotypes: Further evidence for biased information processing in the perception of group-behavior correlations. European Journal of Social Psychology, 21, 25-35.

Seligman, M. E. P. (1970). On the generality of the laws of learning. Psychological Review, 77, 406-418.

Shaklee, H. & Elek, S. (1988). Cause and covariate: Development of two related concepts. Cognitive Development, 3, 1-13.

Shaklee, H. & Hall, L. (1983). Methods of assessing strategies for judging covariation between events. Journal of Educational Psychology, 75, 583-594.

Shaklee, H., Holt, P., Elek, S. & Hall, L. (1988). Covariation judgment: Improving rule use among children, adolescents, and adults. Child Development, 59, 755-768.

Shaklee, H. & Mims, M. (1981). Development of rule use in judgments of covariation between events. Child Development, 52, 317-325.

Shaklee, H. & Mims, M. (1982). Sources of error in judging event covariations: Effects of memory demands. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 208-224.

Shaklee, H. & Paszek, D. (1985). Covariation judgment: Systematic rule use in middle childhood. Child Development, 56, 1229-1240.

Shaklee, H. & Tucker, D. (1980). A rule analysis of judgments of covariation between events. Memory & Cognition, 8, 459-467.

Shaklee, H. & Wasserman, E. A. (1986). Judging interevent contingencies: Being right for the wrong reasons. Bulletin of the Psychonomic Society, 24, 91-94.

Shanks, D. R. (1985). Continuous monitoring of human contingency judgment across trials. Memory & Cognition, 13, 158-167.

Shanks, D. R. (1989). Selectional processes in causality judgment. Memory & Cognition, 17, 27-34.

Shanks, D. R. & Dickinson, A. (1987). Associative accounts of causality judgment. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 21, pp. 229-261). New York: Academic Press.
Shanks, D. R. & Dickinson, A. (1991). Instrumental judgment and performance under variations in action outcome contingency and contiguity. Memory & Cognition, 19, 353-360.

Sherman, S. J., Hamilton, D. L. & Roskos-Ewoldsen, D. R. (1989). Attenuation of illusory correlation. Personality and Social Psychology Bulletin, 15, 559-571.

Shultz, T. R. & Lepper, M. R. (1996). Cognitive dissonance reduction as constraint satisfaction. Psychological Review, 103, 219-240.

Shweder, R. A. (1975). How relevant is an individual difference theory of personality? Journal of Personality, 43, 455-484.

Shweder, R. A. (1977a). Illusory correlation and the MMPI controversy. Journal of Consulting and Clinical Psychology, 45, 917-924.

Shweder, R. A. (1977b). Likeness and likelihood in everyday thought: Magical thinking in judgments about personality. Current Anthropology, 18, 637-658.

Shweder, R. A. (1982). Fact and artifact in trait perception: The systematic distortion hypothesis. In B. A. Maher & B. Maher (Eds.), Progress in personality research (Vol. 2, pp. 65-101). New York: Academic Press.
Smedslund, J. (1963). The concept of correlation in adults. Scandinavian Journal of Psychology, 4, 165-173.

Smith, E. R. (1991). Illusory correlation in a simulated exemplar-based memory. Journal of Experimental Social Psychology, 27, 107-123.

Smith, E. R. (1996). What do connectionism and social psychology offer each other? Journal of Personality and Social Psychology, 70, 893-912.

Smither, J. W., Collins, H. & Buda, R. (1989). When ratee satisfaction influences performance evaluations: A case of illusory correlation. Journal of Applied Psychology, 74, 599-605.

Snyder, M. (1984). When belief creates reality. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 18, pp. 247-305). New York: Academic Press.
Snyder, M. & Miene, P. K. (1994). Stereotyping of the elderly: A functional approach. British Journal of Social Psychology, 33, 63-82.

Snyder, M. & Swann, W. B. (1978). Hypothesis-testing strategies in social interaction. Journal of Personality and Social Psychology, 36, 1202-1212.

Spears, R., Eiser, J. R. & van der Pligt, J. (1987). Further evidence for expectation-based illusory correlations. European Journal of Social Psychology, 17, 253-258.

Spears, R., van der Pligt, J. & Eiser, J. R. (1985). Illusory correlation in the perception of group attitudes. Journal of Personality and Social Psychology, 48, 863-875.

Spears, R., van der Pligt, J. & Eiser, J. R. (1986). Generalizing the illusory correlation effect. Journal of Personality and Social Psychology, 51, 1127-1134.

Srull, T. K. & Wyer, R. S. (1989). Person memory and judgment. Psychological Review, 96, 58-83.

Stangor, C. & McMillan, D. (1992). Memory for expectancy-congruent and expectancy-incongruent information: A review of the social and social developmental literatures. Psychological Bulletin, 111, 42-61.

Starr, B. J. & Katkin, E. S. (1969). The clinician as an aberrant actuary: Illusory correlation and the incomplete sentence blank. Journal of Abnormal Psychology, 74, 670-675.

Stroessner, S. J., Hamilton, D. L. & Mackie, D. M. (1992). Affect and stereotyping: The effect of induced mood on distinctiveness-based illusory correlations. Journal of Personality and Social Psychology, 62, 564-576.

Sugimori, S. (1991). Effects of group size upon proportion judgments of likable and unlikable members and group impressions: Small is bad, not that large is good. Japanese Journal of Experimental Social Psychology, 30, 217-227.

Swann, W. B., Giuliano, T. & Wegner, D. M. (1982). Where leading questions can lead: The power of conjecture in social interaction. Journal of Personality and Social Psychology, 42, 1025-1035.

Tajfel, H. (1982). Social psychology of intergroup relations. Annual Review of Psychology, 33, 1-39.

Taylor, S. (1991). Asymmetric effects of positive and negative events: The mobilization-minimization hypothesis. Psychological Bulletin, 110, 67-85.

Taylor, S. E., Fiske, S. T., Etkoff, N. L. & Ruderman, A. J. (1978). Categorical and contextual bases of person memory and stereotyping. Journal of Personality and Social Psychology, 36, 778-793.

Tobacyk, J. J. (1991). Superstition and beliefs about the prediction of future events. Psychological Reports, 68, 511-512.

Tomarken, A. J., Sutton, S. K. & Mineka, S. (1995). Fear-relevant illusory correlations: What types of associations promote judgmental bias? Journal of Abnormal Psychology, 104, 312-326.

Trope, Y. & Bassok, M. (1983). Information-gathering strategies in hypothesis-testing. Journal of Experimental Social Psychology, 19, 560-576.
Trope, Y. & Liberman, A. (1993). The use of trait conceptions to identify other people's behavior and to draw inferences about their personalities. Personality and Social Psychology Bulletin, 19, 553-562.

Trope, Y. & Thompson, E. P. (1997). Looking for truth in all the wrong places? Asymmetric search of individuating information about stereotyped group members. Journal of Personality and Social Psychology, 73, 229-241.

Tversky, A. (1977). Features of similarity. Psychological Review, 84, 327-352.

von Restorff, H. (1933). Über die Wirkung von Bereichsbildungen im Spurenfeld [On the effect of field formations in the trace field]. Psychologische Forschung, 18, 299-342.

Wallach, L. & Wallach, M. A. (1994). Gergen versus the mainstream: Are hypotheses in social psychology subject to empirical test? Journal of Personality and Social Psychology, 67, 233-242.

Waller, R. W. & Keeley, S. M. (1978). Effects of explanation and information feedback on the illusory correlation phenomenon. Journal of Consulting and Clinical Psychology, 46, 342-343.

Wampold, B. E., Casas, J. M. & Atkinson, D. R. (1981). Ethnic bias in counseling: An information processing approach. Journal of Counseling Psychology, 28, 498-503.

Wasserman, E. A., Chatlosh, D. L. & Neunaber, D. J. (1983). Perception of causal relations in humans: Factors affecting judgments of response-outcome contingencies under free-operant procedures. Learning and Motivation, 14, 406-432.

Wasserman, E. A., Dorner, W. W. & Kao, S.-F. (1990). Contribution of specific cell information to judgments of interevent contingency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 509-521.

Wasserman, E. A., Elek, S. M., Chatlosh, D. L. & Baker, A. G. (1993). Rating causal relations: Role of probability in judgments of response-outcome contingency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 174-188.

Wasserman, G. S. & Kong, K. L. (1974). Illusory correlations of brightness enhancement and transients in the nervous system. Science, 184, 911-913.

Weinstein, N. D. (1980). Unrealistic optimism about future life events. Journal of Personality and Social Psychology, 39, 806-820.

Winkel, F. W. (1990). Crime reporting in newspapers: An exploratory study of the effects of ethnic references in crime news. Social Behaviour, 5, 87-101.

Wright, J. C. & Murphy, G. L. (1984). The utility of theories in intuitive statistics: The robustness of theory-based judgments. Journal of Experimental Psychology: General, 113, 301-322.

Zuckerman, M., Knee, C. R., Hodgins, H. S. & Miyake, K. (1995). Hypothesis confirmation: The joint effect of positive test strategy and acquiescence response set. Journal of Personality and Social Psychology, 68, 52-60.

This research was supported by a grant from the Deutsche Forschungsgemeinschaft. Thanks are due to Edgar Erdfelder, Karl-Christoph Klauer, Leonel Garcia-Marques, Christian Unkelbach, and Henning Plessner for their helpful comments and constructive suggestions on a draft of this article.
Correspondence may be addressed to Klaus Fiedler, Psychologisches Institut, Universität Heidelberg, Hauptstrasse 47-51, Heidelberg, Germany, 69117.
Electronic mail may be sent to Klaus-Fiedler@psi-sv2.psi.uni-heidelberg.de
Received: July 20, 1999
Revised: September 3, 1999
Accepted: September 7, 1999

Table 1. Overview of Research on Expectancy-Based Illusory Correlations

Table 2. Overview of Research on Illusory Correlations Originating in Unequal Weighting of Different Event Combinations