Reference:
Gabora, L., Rosch, E., & Aerts, D. (2008). Toward an ecological theory of concepts. Ecological Psychology, 20(1), 84-116.
Toward an Ecological Theory of Concepts
Liane Gabora
Department of Psychology, University of British Columbia
Eleanor Rosch
Department of Psychology, University of California, Berkeley
Diederik Aerts
Leo Apostel Centre for Interdisciplinary Studies and Department of Mathematics, Brussels Free University
Running head: Ecological Theory of Concepts
For correspondence regarding the manuscript:
Liane Gabora
liane.gabora@ubc.ca
Department of Psychology
University of British Columbia
Okanagan Campus, 3333 University Way
Kelowna BC, Canada V1V 1V7
Abstract
Psychology has had difficulty accounting for the creative, context-sensitive manner in which concepts are used. We believe this stems from the view of concepts as identifiers rather than bridges between mind and world that participate in the generation of meaning. This paper summarizes the history and current status of concepts research, and provides a non-technical summary of work toward an ecological approach to concepts. We outline the rationale for applying generalizations of formalisms originally developed for use in quantum mechanics to the modeling of concepts, showing how it is because of the role of context that deep structural similarities exist between the two. A concept is defined not just in terms of exemplary states and their features or properties, but also by the relational structures of these properties, and their susceptibility to change under different contexts. The approach implies a view of mind in which the union of perception and environment drives conceptualization, forging a web of conceptual relations or ‘ecology of mind’.
Keywords: concepts, context, entanglement, exemplar, prototype, quantum structure
Toward an Ecological Theory of Concepts
This paper reviews the history of concepts research over the last thirty years, summarizes outstanding problems in the field, and suggests how these problems can be addressed through an approach to concepts that is ecological in character; that is, an approach that emphasizes how concepts are derived from and function as participatory elements of the life activities in which humans engage. This may seem like a strange move to some, given the emphasis in ecological psychology on percepts rather than concepts (Heft, 2003). However, the distinction between the two is not so great as once thought, and we see much to be gained by making not just action but complex thought (which in turn governs more complex action) amenable to an ecological approach. Not only could this result in a richer, more embodied theory of concepts, it may also result in a richer view of what an ‘ecological approach’ could mean for psychology. Specifically, we envision that something akin to the food chains and webs of interrelations that one sees in ecologies may potentially come to be as much a part of ecological psychology as situatedness. The insights of Shaw, Rosen, and others about incorporating ‘relation’ into theories of living agents are potentially applicable to the conceptualizations that emerge through interactions between mind and world, as well as the ‘life’ of concepts interacting with one another.
Concepts are generally thought to be what enable us to interpret situations in terms of previous situations that we judge as similar to the present. They can be concrete, like chair, or abstract, like beauty. Traditionally they have been viewed as internal structures that represent a class of entities in the world. However, increasingly they are thought to have no fixed representational structure, their structure being dynamically influenced by the contexts in which they arise (Hampton, 1997; Riegler, Peschl & von Stein, 1999). This is evidenced by their flexibility to change their structure in an infinite variety of ways, given the infinite number of contexts that can affect them. For example, the concept baby can be applied to a real human baby, a doll made of plastic, or a small stick figure painted with icing on a cake. A songwriter might think of baby in the context of needing a word that rhymes with maybe. And so forth. Moreover, whereas in the past the primary function of concepts was thought to be the identification of items as instances of a particular class, increasingly they are seen not just to identify but to actively participate in the generation of meaning (Rosch, 1999). For example, if one refers to a small wrench as a baby wrench, one is not trying to identify the wrench as an instance of baby, nor to identify a baby as an instance of wrench. Thus concepts are doing something more subtle and complex than internally representing things in the external world. Determining what this ‘something more’ is and how it functions may well be the most important task facing psychology today; it is vital to understanding the adaptability and compositionality of human thought.
We begin by briefly summarizing the history of attempts to formalize what is meant by a concept and how situations get classified as instances of a concept. Then we outline some key problems that have arisen in accounting for the contextual nature of concepts, what happens when two or more concepts combine, and the role of similarity in categorization. This is followed by a summary of what we see as feasible steps toward an approach to concepts that is ecological in character, using the State-COntext-Property, or SCOP, theory of concepts. We show how such an approach can provide a means for handling problems of context, concept combination, similarity judgments, and compositionality. Specifically, we discuss an approach that applies generalizations of mathematics originally developed for quantum mechanics to the description of concepts (Gabora & Aerts, 2002; Aerts & Gabora, 2005a, b). This is not the first or only application of concepts and formalisms first used in quantum mechanics to psychology (e.g. Bruza & Cole, 2005; Busemeyer et al., 2006; Busemeyer & Wang, 2007; Gibson & Crooks, 1938; Kadar & Shaw, 2000; Nelson et al., 2003; Nelson & McEvoy, 2007; Turvey & Shaw, 1995; Widdows, 2003; Widdows & Peters, 2003). We will argue that the persistence of such attempts may reflect the fact that the two domains (quantum mechanics and the psychology of concepts) have a shared underlying structure, and that this is related to inescapably strong effects of context.
In fact, what we do with SCOP is somewhat contrary to how theories usually arise. Usually, theories arise first as specific, rather ad hoc models, which over time give rise to abstract theories (such that the first ad hoc models come out as special cases of the abstract theory). By introducing SCOP we go immediately to the abstract theory, though it is specified within SCOP how to work out specific concrete models. There are many other examples in the history of science where it happened this way. For example, Einstein’s relativity theory was from the start the most general theory, and its validity was slowly confirmed over decades by people working out specific models for specific situations using the general structures given by Einstein’s theory. But even now, after 80 years, this is still an ongoing process, still provoking debate over certain aspects of the general theory. Newton’s work also had a flavor of this approach (first the general abstract theory, and then concrete special cases). In a similar fashion, with SCOP we aim at a general theory of concepts incorporating contextuality as an intrinsic, not ad hoc, element.
There is, however, a specific remark we want to make in relation to the fact that SCOP is a generalized quantum theory. Many theories that were historically part of physics have now been classified as part of mathematics, such as geometry, probability theory, and statistics. At the times when they were considered physics they focused on modeling parts of the world pertaining to physics. In the case of geometry this was shapes in space, and in the case of probability theory and statistics this was the systematic estimation of uncertain events in physical reality. These originally physical theories have now taken their most abstract forms and are readily applied in other domains of science, including the human sciences, since they are considered mathematics, not physics. (An even simpler example of how a theory of mathematics is applicable in all domains of knowledge is number theory. We all agree that counting, as well as adding, subtracting, and so forth, can be done independent of the nature of the object counted.) It is in this sense that we use mathematical structures coming from quantum mechanics to build a contextual theory of concepts without attaching the physical meaning attributed to them when applied to the micro-world. As always in science when starting from the general theory, the value of this approach must be apparent in its applications, i.e. specific models worked out for specific situations. Toward the end of the paper we mention some applications of SCOP to specific situations arising in concepts research that are worked out in detail in other papers.
We now examine some predominant theories of concepts that have emerged, in roughly chronological order.
Concepts and categorization are the areas in psychology that deal with the ancient philosophical problem of universals; that is, with the fact that unique particular objects or events can be treated equivalently as members of a class. Most philosophers since Plato have agreed that experience of particulars as it comes moment by moment through the senses is unreliable; therefore, only stable, abstract, logical, universal categories can function as objects of knowledge and objects of reference for the meaning of words. To fulfill these functions, categories had to be exact, not vague (i.e. have clearly defined boundaries), and their members had to have attributes in common that constituted the essence of the category, e.g. the necessary and sufficient conditions for membership in the category. It followed that all members of the category were equally good with regard to membership; either they had the necessary common features or they didn’t. Categories were thus seen as logical sets, and the mathematics of classical set theory were assumed to apply to them.
This view of categories entered psychology in the form of concept learning research in the 1950s, led by the work of Jerome Bruner and associates. In one study (Bruner, Goodnow, & Austin, 1956), subjects were asked to learn concepts which were logical sets defined by explicit attributes such as red and square, combined by logical rules, such as ‘and’. Theoretical interest was focused on how subjects learned which attributes were relevant and which rules combined them. In developmental psychology, the theories of Piaget and Vygotsky were combined with the concept learning paradigm to study how children’s ill-structured, often thematic, concepts developed into the logical adult mode. For linguists, the relationship between language and concepts appeared unproblematic; words simply referred to the defining features of the concepts, and it was the job of semanticists to work out a suitable formal model that would show how this relationship could account for features such as synonymy and contradiction. Artificial stimuli were typically used in research at all levels, structured into micro-worlds in which the prevailing beliefs about the nature of categories were already built into the stimuli and task (for examples, see Bourne, Dominowski, & Loftus, 1979). Thus early empirical research could not refute the classical view since the view was built into the structure of experiments. Although since then ample evidence against the classical view has been gathered (Komatsu, 1992; Rosch, 1999; Smith & Medin, 1981), it has remained a persistent and pervasive force in Western treatments of concepts.
A major challenge to the classical view came in the 1970s in the form of evidence that actual categories in use are not the bounded, clearly defined entities required by classical logic. This was first shown with respect to colour (Rosch, 1973). Consider the following question: is red hair as good an example of red as a red fire engine? Most people answer ‘no’ to this question, and to other questions of this sort. However, if categories were the sorts of entities entailed by classical logic, it would not be possible for one instance to be a better or worse example than another. With examples of this sort it was shown that instances are judged to have differing degrees of membership in the category, and that colour categories have neither criterial attributes nor definite boundaries. Furthermore, psychological representation of the category appeared to be concrete rather than abstract; e.g. people universally agree that some colours match their idea or image of that colour category better than others. An extensive program of research has demonstrated that the same form of graded structure applies to categories of the most diverse kinds: perceptual categories such as colours and forms, semantic categories such as furniture, biological categories such as woman, social categories such as occupation, political categories such as democracy, formal categories that have classical definitions such as odd number, and ad hoc goal derived categories such as things to take out of the house in a fire.
Furthermore, gradients of membership must be considered psychologically important because such measures have been shown to affect virtually every major method of study and measurement used in psychological research: learning, speed of processing, expectation, association, inference, probability judgments, natural language use, and judgments of similarity (Rosch, 1999; see also Markman, 1989; Mervis & Crisafi, 1982; Mervis & Rosch, 1981; Rosch, 1973, 1978; Rosch & Lloyd, 1978; Smith & Medin, 1981).
Rosch’s theory of graded structure categorization, in its most general form, was that concepts and categories form to mirror real-world structure (of both perception and life activities) rather than logic. More specifically:
1) Prototypes. Categories form around and/or are mentally represented by salient, information rich, often imageable stimuli that become “prototypes” for the category. Other items are judged in relation to these prototypes, thus forming gradients of category membership. There need be no defining attributes which all category members have in common, and category boundaries need not be definite. Sources of prototypes are diverse: while some may be based on statistical frequencies, such as the means or modes (or family resemblance structures) for various attributes, others appear to be ideals made salient by factors such as physiology (good colors, good forms), social structure (president, teacher), culture (saints), goals (ideal foods to eat on a diet), formal structure (multiples of 10 in the decimal system), causal theories (sequences that “look” random), and individual experiences (the first learned or most recently encountered items or items made particularly salient because they are emotionally charged, vivid, concrete, meaningful, or interesting).
2) Basic-level Objects. What determines the level of abstraction at which items will be categorized? Rosch, Mervis, Gray, Johnson & Boyes-Braem (1976) argued that there is a basic level of abstraction (e.g. chair, dog) that mirrors the correlational structure of properties in the object’s real-world perception and use. Categories form, are learned, and are perceived first at this level, then further discriminated at the subordinate level (e.g. kitchen chair, spaniel) and abstracted at the superordinate level (furniture, animal). Within a given category, this same process of maximizing information through correlational structure leads to the formation of prototypes. One of the most philosophically cogent aspects of prototypes and basic objects is that, far from being abstractions of a few defining attributes, they are rich, imagistic, sensory, full-bodied mental events that serve as reference points in all of the kinds of research effects mentioned above.
A very important finding about prototypes and graded structure is how sensitive they are to context. For example, while dog or cat might be given as prototypical pet animals, lion or elephant are more likely to be given as prototypical circus animals. In a default context (no context specified), coffee or tea or coke might be listed as a typical beverage, but wine is more likely to be selected in the context of a dinner party. Furthermore, people show perfectly good category effects complete with graded structure for ad hoc, goal derived categories such as good places to hide from the Mafia. In fact, the effects of context on graded structure are ubiquitous (Barsalou, 1987). In the classical view, from the time of its origin in Greek thought, if an object of knowledge were to change with every whim of circumstance, it would not be an object of knowledge, and the meaning of a word must not change with conditions of its use. One of the great virtues of the criterial attribute assumption for its proponents had been that the hypothesized criterial attributes were just what didn’t change with context. Barsalou argued that context effects show that category prototypes and graded structure are not pre-stored as such, but rather created anew each time on the fly from more basic features or other mental structures. The extreme flexibility of categories to context effects may have even more fundamental implications.
Many reactions to the above view of categorization consisted primarily of attempts to deal with the empirical data from graded structure research without changing one’s idea of the real nature of categories as fundamentally classical (or at the very least, requiring some sort of essentialist classical mental representation structure mediating them). Rosch’s (1973) prototype theory was not presented in the form of a mathematical model and, indeed, challenged the appropriateness of set-theoretic models used by the classical view. In an influential paper, Osherson and Smith (1981) modeled prototype theory using Zadeh’s (1965) fuzzy set logic, in which the membership of an item in a conjunctive category is computed by a minimum rule (the smaller of its memberships in the constituent categories), and showed that prototypes do not follow this rule; the typicality of a conjunction is not simply a function of the typicality of its constituents. This has come to be called the ‘pet fish problem’ because guppy is rated as a good example neither of pet nor of fish, but as an excellent example of the category pet fish. They took this critique of Zadeh’s fuzzy set logic as a refutation of graded structure and prototypes. Another set of models, called probabilistic models, re-define graded structure as the probability of an item’s being classified as a member of the category. These kinds of probabilities are not actually an appropriate measure for graded structure; they do not capture the fact that people universally judge items both to be definite members of a category, and to have definitely differing degrees of membership, some better examples than others. Nor do they capture the fact that people judge some items to be factually not members, i.e. to genuinely straddle two or more categories. (Shortly we will give a more formal argument for why such statistical probabilities are inappropriate for modeling concepts.)
Furthermore, in most probabilistic models, artificial categories are once again used as the stimuli, with the same difficulties noted for this strategy in our discussion of the classical view.
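The conjunction problem raised by Osherson and Smith (1981) can be made concrete with a minimal sketch. It assumes the standard fuzzy intersection of Zadeh (1965), in which membership in a conjunction is the minimum of the constituent memberships. The typicality values and the function names are hypothetical, chosen only to mirror the qualitative pattern described above; they are not data from any study.

```python
# Hypothetical typicality ratings on a 0-1 scale (illustrative only,
# not data from Osherson and Smith, 1981).
typicality = {
    "guppy": {"pet": 0.3, "fish": 0.4, "pet fish": 0.9},
    "dog": {"pet": 0.95, "fish": 0.0, "pet fish": 0.0},
}


def fuzzy_conjunction(a: float, b: float) -> float:
    """Fuzzy intersection: the minimum of the two constituent memberships."""
    return min(a, b)


def violates_min_rule(ratings: dict) -> bool:
    """True when the rated conjunction exceeds what the minimum rule allows."""
    predicted = fuzzy_conjunction(ratings["pet"], ratings["fish"])
    return ratings["pet fish"] > predicted


for item, ratings in typicality.items():
    predicted = fuzzy_conjunction(ratings["pet"], ratings["fish"])
    print(
        f"{item}: predicted {predicted:.2f}, "
        f"rated {ratings['pet fish']:.2f}, "
        f"violation={violates_min_rule(ratings)}"
    )
```

Because the hypothetical guppy rating for pet fish exceeds both constituent ratings, it exceeds their maximum as well as their minimum, so no rule that is bounded by the constituent values can reproduce it.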
One main issue of debate in the early models was the level of abstraction and/or detail that need be assumed in the category representation. Extreme prototype-as-abstraction models assert that only a summary representation preserving the central tendencies among category exemplars is necessary. Other investigators modeled the category representation in the form of a frequency distribution which preserves not only the central tendencies, but also some information about the shapes of the distribution and the extent of variability among exemplars. (See Barsalou, 1990; Neisser, 1989; Smith & Medin, 1981 for summaries.)
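The difference between a summary representation and a frequency-distribution representation can be illustrated with a small sketch. The two sets of exemplars, the two numeric features, and the function names are all hypothetical; the sketch shows only that two categories can share the same central tendency while differing in variability, which is the information a pure summary prototype discards.

```python
from statistics import mean, pstdev

# Hypothetical exemplars, each described by two features on arbitrary scales.
narrow = [(4.0, 7.0), (5.0, 8.0), (6.0, 6.0), (5.0, 7.0)]
wide = [(1.0, 7.0), (9.0, 7.0), (5.0, 7.0), (5.0, 7.0)]


def summary_prototype(items):
    """Central tendency only: the mean value on each feature."""
    return tuple(mean(values) for values in zip(*items))


def distribution_summary(items):
    """Central tendency plus variability: (mean, standard deviation) per feature."""
    return tuple((mean(values), pstdev(values)) for values in zip(*items))


# Both categories collapse to the same summary prototype...
print(summary_prototype(narrow), summary_prototype(wide))
# ...but the distribution-based representation still tells them apart.
print(distribution_summary(narrow))
print(distribution_summary(wide))
```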
Another class of models, called decision bound or rule-based models, represent categories as regions in multidimensional space separated by a decision rule or boundary (Ashby & Maddox, 1993; Maddox & Ashby, 1993). Although in theory these boundaries can assume any form, they are assumed to be linear or quadratic, because this provides regions that are simple enough to be realistically learnable, and most likely to match the boundaries of natural categories. (Thus for example, stimuli might consist of a set of lines that vary in length and orientation linearly separated in such a way that the subject must take both length and orientation into account to decide whether a given stimulus belongs to category A or B.) This is a provocative approach, though the type of data such models can account for is limited; for example, they cannot handle category conjunctions. Furthermore, there is a lack of evidence that subjects use all-or-none cutoffs even in artificial categories (Kalish & Kruschke, 1997) much less in real-world categories.
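A linear decision bound of the kind described in the parenthetical example can be sketched as follows. The class name, the boundary coefficients, and the stimulus values are assumptions made for illustration, not parameters from Ashby and Maddox (1993); the point is only that neither length nor orientation alone determines the category.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearBound:
    """Boundary defined by w_length*length + w_orientation*orientation + bias = 0."""

    w_length: float
    w_orientation: float
    bias: float

    def classify(self, length: float, orientation: float) -> str:
        """Assign category A on the positive side of the bound, B otherwise."""
        score = (
            self.w_length * length
            + self.w_orientation * orientation
            + self.bias
        )
        return "A" if score > 0 else "B"


# Hypothetical boundary: the diagonal where length equals orientation
# (both in arbitrary units).
bound = LinearBound(w_length=1.0, w_orientation=-1.0, bias=0.0)

# The same length falls in different categories depending on orientation.
print(bound.classify(length=45, orientation=30))
print(bound.classify(length=45, orientation=60))
```

The all-or-none cutoff in `classify` is exactly the assumption questioned above: every stimulus on one side of the bound receives the same label, with no graded membership.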
In yet another class of models, exemplar models, a concept is represented by a set of instances or exemplars of it stored in memory (Medin, Altom, & Murphy, 1984; Nosofsky, 1988, 1992; Heit & Barsalou, 1996). Thus each exemplar has a uniquely weighted set of features, and a new item is categorized as an instance of the concept if it is sufficiently similar to the most salient previously encountered exemplars. The exemplar model has met with considerable success at predicting experimental results (e.g. Nosofsky, 1992; Tenpenny, 1995); however, it does not fully reproduce individual differences in the distributions of responses across test stimuli (Nosofsky et al., 1994), and cannot account for certain base-rate effects in categorization (Nosofsky et al., 1992). Moreover, the choice of concepts used in experiments that support the exemplar theory obscures the counterintuitiveness of the assumptions underlying it. They typically come from perceptual data, rather than data obtained using abstract concepts such as beauty or the number five. Surely when employing the concept five, one does not calculate how different the current situation is from previously encountered instances of five, such as, say, your five cousins, the five trees in your backyard, and the five buttons on your favorite shirt. One appears to have abstracted something essential out of such instances to form a concept five that no longer has much to do with the irrelevant details of particular situations. Moreover, to define a concept in terms of weighted averages for certain features presupposes that it is possible to state objectively what the relevant features are. Unless a context is specified, there is no basis for supposing that one feature is more relevant than another; otherwise one might just as well reason that because Ann could sit in either chair A, chair B, or chair C, Ann can be defined as some sort of average of a human sitting in each of these three chairs.
Clearly, Ann is more than this, much as beauty is more than an average taken across certain features of salient instances of beauty. This sort of problem also plagues a related approach to concepts in which they are viewed as perceptual symbol systems, that is, simulators of sets of similar perceptually-based memories (Barsalou 1999).
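The dependence of exemplar-based categorization on feature weights can be illustrated with a minimal sketch. It uses one common formalization, summed similarity with an exponential decay over weighted city-block distance and a ratio choice rule, in the spirit of the exemplar models cited above. The similarity function, the weights, the stored exemplars, and the probe are all assumptions made for illustration.

```python
import math


def similarity(x, y, weights, sensitivity=1.0):
    """Exponentially decaying similarity over a weighted city-block distance."""
    distance = sum(w * abs(a - b) for w, a, b in zip(weights, x, y))
    return math.exp(-sensitivity * distance)


def category_probabilities(item, exemplars, weights, sensitivity=1.0):
    """Ratio rule over summed similarity to each category's stored exemplars."""
    summed = {
        label: sum(similarity(item, ex, weights, sensitivity) for ex in stored)
        for label, stored in exemplars.items()
    }
    total = sum(summed.values())
    return {label: value / total for label, value in summed.items()}


# Hypothetical stored exemplars described by two features.
exemplars = {
    "A": [(1.0, 1.0), (1.5, 0.5)],
    "B": [(4.0, 4.0), (3.5, 4.5)],
}

# A probe that resembles A on the first feature and B on the second.
probe = (1.0, 4.0)

attend_first = category_probabilities(probe, exemplars, weights=(1.0, 0.0))
attend_second = category_probabilities(probe, exemplars, weights=(0.0, 1.0))
print(attend_first)
print(attend_second)
```

The same probe is assigned mostly to A under one weighting and mostly to B under the other, which is the sense in which the model's output presupposes a prior decision about which features are relevant.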
Yet another class of models accounts for graded structure by dividing a concept into its core concept and processing heuristics. In this approach, the actual meaning for category terms is a classical definition onto which is added a processing heuristic or identification procedure that accounts for graded structure aspects (Osherson & Smith, 1981; Smith, Shoben & Rips, 1974). In this way, for example, odd number can ‘have’ both a classical definition and a prototype. This distinction between core concept and heuristics is also central to Wisniewski’s (1996, 1997a, b) dual process model of concept combination, which assumes that the combination process involves comparing and then aligning potentially complex but nevertheless incomplete summary schemas of the concepts being combined. This is a dangerous move, for it decouples theory from any empirical referent. The actual meaning of a category term becomes a kind of metaphysical classical entity known by logic alone that is unassailable by data, data being assigned to the peripheral processing heuristics.
All of these theories incorporating graded structure effects have a somewhat analogous problem rendering it difficult, if not impossible, to distinguish between them on the basis of empirical evidence. Each contains a model of storage that is always presented with complementary processing assumptions, which allow it to match any kind of experimental data (Barsalou, 1990). This fact, along with the difficulties cited earlier such as the inappropriate use of probabilities and the reliance on decontextualized artificial stimuli and tasks, seems to indicate that a new type of modeling is called for.
Such a new vision of concepts and categorization would seem to be offered by the view of concepts as theories (Medin, 1989; Medin & Wattenmaker, 1987; Murphy & Medin, 1985)[1]. The theories view appears to capture an important intuition that people have about concepts and categories, which is that concepts do not stand apart independently but belong to systems larger than themselves. By conceiving of concepts as theory related, the theory theorists are able to avoid talking about attributes, whether criterial or not, or about the problems of defining and measuring similarity, difficult issues for the previous views. On the other hand, context dependent effects in categorization are readily incorporated. For example, the context dependent finding that grey clouds are judged more similar to black clouds than to white clouds but grey hair more similar to white hair than to black hair is accounted for by saying that we have different theories about clouds and hair. An additional virtue of the theory view is that the persistence of the classical view of categories can be incorporated; it is seen as a persistent theory that we have about categories that children develop with age (Keil, 1989; Medin & Ortony, 1989).
There are several difficulties with the theories view (Komatsu, 1992; Fodor, 1994; Hayes et al. 2003; Rips, 1995; Rosch, 1999). One problem is that conceptual change often happens incrementally, “without a radical restructuring of ones’ beliefs or knowledge” (Hayes et al. 2003). Moreover, theory theorists never define or describe what they mean by theory, and offer not a single example of an actual theory from which findings, even one finding, in categorization research could be derived. Nor is there any attempt to show how attributes, similarity, or context (for the lack of account of which they criticize other views) could be derived from theories, either in the abstract or from specific theories. What is meant by a theory? Explicit statements that can be brought to consciousness? Any item of world knowledge? The complete dictionary and encyclopedia? Any expectation, habit, belief, desire, skill, custom, value or observed regularity? Any context? It is hard to escape the impression that for the theory theorists, absolutely anything can count as a theory, and that the word theory can be and is invoked as an explanation of any finding (somewhat like the proliferation of instincts and drives in an earlier, now defunct, psychology). If we look more closely at the experiments claimed as support for the theories view, they are primarily demonstrations of what is elsewhere in psychology called context effects. Words and concepts are interpreted differently depending upon the environments or contexts in which they occur.
Gärdenfors (2000a, b) has introduced a provocative geometrical approach to concepts. He considers not just binary features or properties, but dimensions (e.g. color, pitch, temperature, weight). Like theory theory, his approach exploits (to a limited extent) how attributes relate to one another. He distinguishes between integral dimensions, for which one cannot assign a value on one dimension without assigning a value on the other (e.g. hue/brightness, or pitch/loudness), and non-integral dimensions, for which one can assign a value on one dimension without assigning a value on the other (e.g. hue/size or pitch/brightness). This leads him to define a domain as a set of integral dimensions that are separable from all other dimensions. A property (generally an adjective, e.g. red) is defined as a convex region in a domain. Formally this means that if two objects v1 and v2 are both members of a concept to a certain degree then all items between v1 and v2 also satisfy this criterion; e.g. the property ‘red’ is a convex region in a domain defined by the integral dimensions hue, saturation, and brightness. Properties generally refer to a single domain, while concepts generally refer to many domains. A concept is defined as a set of convex regions in a number of domains, together with a salience assignment to the domains, and information about how the regions in different domains are correlated. Concept combination is then modeled as the combining of these sets of convex regions. Thus if concept X combines with concept Y to give concept XY, the region for some domain of modifier X replaces the corresponding region for Y. This ‘modifier + modified’ relationship is very language specific. For example, in red brick, one replaces the original (generic) region for color for the concept brick with the corresponding region for red.
A problem with this approach is that the geometries can be very complex, and change in a context-dependent way, so it is still difficult to describe even the most mundane, everyday creative acts in terms of this model.
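The region-replacement account of concept combination can be sketched in a few lines. This is a deliberately simplified illustration, not Gärdenfors’s formalism: regions are assumed to be axis-aligned boxes (which are convex), hue is treated as a linear rather than circular dimension, salience weights and cross-domain correlations are omitted, and all names and numeric ranges are hypothetical.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]
Region = Dict[str, Interval]   # dimension name -> (low, high); a box is convex
Concept = Dict[str, Region]    # domain name -> region within that domain


def contains(region: Region, point: Dict[str, float]) -> bool:
    """True if the point lies inside the box on every dimension."""
    return all(low <= point[dim] <= high for dim, (low, high) in region.items())


def midpoint(p: Dict[str, float], q: Dict[str, float]) -> Dict[str, float]:
    """The point halfway between p and q."""
    return {dim: (p[dim] + q[dim]) / 2 for dim in p}


def combine(modifier: Concept, head: Concept) -> Concept:
    """The modifier's regions replace the head's regions in shared domains."""
    combined = dict(head)
    combined.update(modifier)
    return combined


# Hypothetical regions for the property red and the concept brick.
red: Concept = {
    "color": {"hue": (0.0, 20.0), "saturation": (0.6, 1.0), "brightness": (0.3, 0.9)}
}
brick: Concept = {
    "color": {"hue": (10.0, 40.0), "saturation": (0.3, 0.8), "brightness": (0.2, 0.7)},
    "size": {"length_cm": (15.0, 30.0)},
}

red_brick = combine(red, brick)
print(red_brick)

# Convexity: the point between two red colors is also red.
p = {"hue": 2.0, "saturation": 0.7, "brightness": 0.4}
q = {"hue": 18.0, "saturation": 0.9, "brightness": 0.8}
print(contains(red["color"], midpoint(p, q)))
```

Even in this toy form, the regions are fixed in advance; nothing in `combine` lets the color region for brick shift with context, which is the limitation noted above.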
An ecological view was first introduced in psychology in regard to perception (Brunswik, 1947) and developed by J. J. Gibson (1979) into a system of ecological optics, which re-described both perceiving organism and perceived environment so that they formed a single unit. In Gibson’s ecological optics, light, space, motion, and other abstract properties are necessarily designated in organismically relevant and dependent ways, and the perceiving organism is necessarily described in environmentally relative and dependent ways. Perception of oneself and one’s environment are, so defined, inseparable: “The supposedly separate realms of the subjective and the objective are actually only poles of attention.” (Gibson, 1979, p. 116). More recently, other cognitive scientists have argued for the inclusive organism-environment field as the basic unit of the science. Combining physiology and philosophy, Skarda (1999) has shown in detail how dualistic perception can arise from the unbroken field of a perceptual event. Jarvilehto (1998a, 1998b) reconceptualizes the relationship between organism and environment as a single system both at the micro (neural) and macro (behavioral) levels. In Trevarthen’s concept of intersubjectivity, interactions between people are codependently defined, experienced, and acted out (Trevarthen, 1993).
Gibson introduced the term affordances to refer to the functions that the perceived world offers the organism. For example, the ground affords support, enclosures afford shelter, and elongated objects afford pounding and striking. Because it is an organismically meaningful world that is perceived and acted upon, form and function are as inseparable and co-defining as perceiving subject and perceived object, and this is the information that constitutes both perception and action. “The act of throwing complements the perception of a throwable object. The transporting of things is part and parcel of seeing them as portable or not.” (Gibson, 1979, p. 235). Yet it is very obvious that the perceiver and the world perceived are experienced as different and separate. What gives?
This is where we see concepts coming into the picture. To apply an ecological approach not to percepts but to concepts may seem unusual. However, we believe that the distinction between percepts and concepts may reflect what the researcher, or observer, is focused on as much as it reflects what is happening for the subject. More importantly, it is only once objects in the world have been conceptualized that they are charged with the potential to dynamically interact in myriad ways with conceptions of other objects as well as with the goals, plans, schemas, desires, attitudes, fantasies, and so forth, that constitute human mental life. It is through these interactions that their relations are discerned, and that together they come to function as an integrated internal model of the world, or worldview. Thus it is when stimuli in the world come to be understood in conceptual terms that they acquire the web-like structure and self-organizing dynamics characteristic of an ecology. It is therefore our view that an ecological treatment of concepts opens up the possibility of making not just action but complex thought processes amenable to a more ecological approach (as suggested by Gregory Bateson (1973) some time ago). Rosch (1999) argues that it is the role of concepts to provide a bridge between what we think of as mind and what we think of as world, and has articulated this position in terms of its implications for concepts. Concepts and categories do not represent the world in the mind, as is generally assumed, but are a participating part of the mind-world whole. Therefore, they only occur as part of a web of meaning provided both by other concepts and by interrelated life activities. This means that concepts and categories exist only in concrete complex situations.
The three major types of concept and categories approaches (classical, graded structure, and theory) may now be examined in relation to situations and context: giving dictionary style definitions of concepts is the sine qua non of the classical view, but even such activity occurs in particular situations, in which the entire background of practices, understandings and teachings with which we have been raised come into play. Note that the very attributes used in this kind of definition of concepts are themselves concepts that can be pointers to affordances and life activities, and how they are organized and understood by the mind.
Prototypes both vary across situations and show inter-situation consistency. Such consistency is a clue to more general life activities. Prototypes with their rich non-criterial information and imagery can indicate, on many different levels, possible ways of situating oneself and navigating complex situations. Basic level object research (Rosch et al., 1976) indicated that category formation is not arbitrary but takes place in such a way as to maximally map the informational structure of the world. What is referred to as a basic level category such as chair seems more like the object’s real name than furniture because it categorizes the object at the level of detail that is most useful for conveying and interpreting meaning, given the forms of living in our culture – a level which would be expected to differ with age, expertise, social structure, and culture. If categories ultimately arise from life activities, basic level categories could provide an entry to the events and processes that produce them. And as a worldview builds up from basic level categories to include more detailed as well as more abstract levels of conceptualizing, it becomes more interconnected, more of an ecology, and comes increasingly to reflect what is unique about the circumstances and idiosyncrasies of the individual.
The examples used in theory arguments typically point to situational variations. For example, a drink might mean beer in the context of truck drivers, milk in the context of a school lunch, and wine in the context of a dinner party. This is attributed to the theories that we have about these matters. We argued earlier that the word theory may be little more than a placeholder for an explanation that is still forthcoming; here we can see how it might point toward the life activities that give rise to inter-situational consistencies. The question to ask with respect to all three views of categories is: what are the relations between perceptual, functional, and causal properties in concrete real-world life situations that are searched out by individual learners and homed in on by the languages and cultures of the world to form maximally useful and meaningful categories?
We have examined the merits and pitfalls of several approaches to concepts, focusing on how they combine and change under the influence of a context. Let us now summarize what we see as the major unsolved issues to be accounted for by a theory of concepts.
We have seen that the situation or context influences the meaning of the concept, and for this reason we need to make room for the context in the description of a concept. It is, however, impossible to circumscribe in advance the diverse situations to which a given concept will be applied, and the unique slants it can be given in unexpected circumstances. For this reason, many express the concern that current theories of concepts get us no closer to understanding the contextual manner in which concepts are actually evoked and used in everyday life (Gerrig & Murphy, 1992; Hampton, 1997; Komatsu, 1992; Medin & Shoben, 1988; Murphy & Medin, 1985; Rosch, 1999).
The problem is analogous, indeed virtually identical, to arguments about whether or not and in what way one needs to include world knowledge in formal semantic models (Fodor, 1998; Rips, 1995). The paradox in most works on this topic is the tacit recognition that it is both necessary and impossible to include such knowledge. Rips (1995), for example, claims: “...part of the semantic story will have to include external causal connections that run through the referents and their representations” (p. 84), but in the same work asserts with his No Peeking Principle that we cannot be expected to incorporate into a theory of concepts how they interact with world knowledge (what Hampton (1997) refers to as extensional feedback and Searle (1993) expresses in his ceteris paribus argument). The idea that this is impossible stems from the fact that one cannot incorporate into a model of concepts how a concept would manifest in every possible context. An accurate mathematical description of the concept SCREWDRIVER, for example, would have to incorporate not only the most typical attributes of SCREWDRIVER, and not only attributes of SCREWDRIVER that are occasionally present, but also attributes of SCREWDRIVER that might cause it to be elicited spontaneously in response to some unforeseen context. The concept SCREWDRIVER might be most commonly evoked by situations that involve typical SCREWDRIVER features such as the context of ‘tool’. However, a criminal might think of SCREWDRIVER in the context of ‘weapon’. And so forth.
Theories of concepts have been relatively successful at describing and predicting the results of cognitive processes involving relationships of cause and effect using artificial stimuli where the effects of background information and nuances of personal meaning are minimal. Difficulties arise when it comes to natural categories, one classic problem being what happens when concepts interact to form a conjunction, or in more complex sorts of combinations such as sentences. As many studies (e.g. Hastie et al., 1990; Kunda et al., 1990; Hampton, 1997) have shown, a conjunction often possesses features that are said to be emergent: not true of its constituents. For example, the properties lives in cage and talks are considered true of pet birds, but not true of pets or birds. Representational theories are not only incapable of predicting what sorts of features will emerge (or disappear) in a conjunction, they do not even provide a place in the formalism for the gain (or loss) of features. This problem is hinted at by Boden (1990), who uses the term impossibilist creativity to refer to creative acts that not only explore the existing state space (set of all possible states) but transform that state space. In other words it involves the spontaneous generation of new states with new properties. One could try to solve the problem ad hoc by starting all over again with a new state space each time there appears a state that was not possible given the previous state space; for instance, whenever a conjunction like pet bird comes into existence. However, this happens every time one generates a sentence that has not been used before, or even uses the same sentence in a slightly different context. Another possibility would be to make the state space infinitely large to begin with. However, since we hold only a small number of items in mind at any one time, this is not a viable solution to the problem of describing what happens in cognition.
Theories of concepts often employ a notion of distance based on similarity in terms of shared features. Recognition of similarity and difference between things and responses to the things based on that recognition is a universal function of organisms. Behavior on the basis of similarity has been a basic principle in psychology from its earliest beginnings in associationism and Pavlovian conditioning up to its most current techniques in psychophysics. But no one has been able to define or explain similarity in a manner that cannot easily be struck down (Goldmeier, 1972; Tversky, 1977; see Medin, 1989, for a critique of Tversky). The problem is that definitions of similarity tend to be circular; items are defined as similar that are judged to be similar. Moreover, similarity-based theories of concepts have difficulty accounting for why items that are dissimilar or even opposite might nevertheless belong together; for example, why white might be more likely to be categorized with black than with flat, or why dwarf might be more likely to be categorized with giant than with, say, salesman. Wisniewski would answer that it is because DWARF and GIANT are ‘alignable’ with respect to the dimension of size, but his (1997a,b) dual process theory does not go the next step and show what kind of mathematical space concepts must ‘live’ in to spontaneously, with incomplete knowledge of one another, become aligned. We provide an approach to resolving this after introducing the relevant formalism.
To accomplish all that we expect of a theory of concepts, it must be a mathematical theory. The formalisms used have tended to be limited in scope or inappropriate to the general approach being tested; often they are examples of the tail wagging the dog. The ecological situational approach, though conceptually appealing, is challenging due to the lack of fixed reference points for concepts and the element of novelty and creativity in concepts that it encompasses. A mathematics entirely new to psychology is called for.
The state of concepts research today is in some ways reminiscent of that of quantum mechanics a century ago. Quantum mechanics was born as a discipline when experiments on micro-particles revealed, for the first time in history, a world that completely resisted description using the mathematics of classical mechanics that had been so successful until then. One point of similarity between quantum entities and concepts is that both differ from entities that can be described by classical physics, for which if a property is not actual then its negation is actual. If the property ‘not green’ is true of a particular ball, then the property ‘green’ is not true of that particular ball. However, for concepts, as in quantum mechanics, a property and its negation can both be potential. Thus for the concept ball, if nothing is specified for the colour, ‘green’ and ‘not green’ are both potential. One could refer to this as a problem of nonclassical logic for concepts.
A second similarity between quantum entities and concepts is that, much as properties of a quantum entity do not have definite values except in the context of a measurement, properties of a concept do not have definite applicabilities except in the context of a particular situation. In quantum mechanics, the states and properties of a quantum entity are affected in a systematic and mathematically well-modeled way by the measurement. Similarly, the context in which a concept is experienced inevitably colors how one experiences that concept. One could refer to this as an observer effect for concepts. We will show that a generalization of the mathematics of quantum mechanics can be used to describe the effect of context on concepts.
These problems (nonclassical logic and the observer effect) generated in physics the need for a new kind of probability model, i.e. a nonclassical probability model[2]. The only type of nonclassical probabilities that are well known in nature are the quantum probabilities. This suggests that to develop a theory of how concepts interact with the inevitably incompletely specified contexts that evoke them we should look to the quantum probability model, for such a theory cannot be provided by approaches that assume a standard classical probability model such as neural networks, Bayesian networks, and the formal models discussed earlier in this paper. Indeed the spreading activation hypothesis assumed in most such models is not supported empirically (Nelson et al., 2003, in press; Nelson & McEvoy, 2007). According to the notion of spreading activation, activation travels through a fixed associative network, weakening with conceptual distance. That is, it spreads from a target concept to directly associated concepts, to less directly associated concepts, and so forth, such that in order for the activation of the target to remain strong there must be a return route for the activation to get back to the target. Nelson and colleagues tested the classic spreading activation hypothesis against another hypothesis they refer to (after a phrase coined by Einstein) as the ‘spooky activation at a distance’ hypothesis. It predicts that the target activates its network of associates in synchrony, and that each link in the associative set contributes additively to the net strength of their activation. In other words, activation strength is determined not by the spread of activation but by the number and strength of links. A key difference between these two hypotheses is that according to spreading activation, the activation of the target will depend on how many associate-to-target links there are, whereas according to spooky activation, activation of the target can be strengthened by associate-to-associate links even in the absence of associate-to-target links. They found that their experimental results supported the spooky activation at a distance hypothesis and not the spreading activation hypothesis.
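The contrast between the two hypotheses can be illustrated with a toy calculation. The sketch below is not the model or data of Nelson and colleagues; the word labels, link strengths, function names, and both scoring rules are assumptions chosen only to mirror the verbal description above. The spreading rule counts activation only along routes that return to the target, while the spooky rule simply adds up the strengths of all links in the associative set.

```python
# Toy illustration only: link strengths and scoring rules are assumptions,
# not the equations or data of Nelson and colleagues.

def spreading_activation(target, links):
    """Sum activation over routes that leave the target and return to it.

    Considers two-step routes (target -> a -> target) and three-step routes
    (target -> a -> b -> target). Strength along a route is the product of
    its link strengths, so it weakens with distance.
    """
    associates = [b for (a, b) in links if a == target]
    total = 0.0
    for a in associates:
        total += links[(target, a)] * links.get((a, target), 0.0)
        for b in associates:
            if b != a:
                total += (links[(target, a)]
                          * links.get((a, b), 0.0)
                          * links.get((b, target), 0.0))
    return total


def spooky_activation(target, links):
    """Sum the strengths of every link within the target's associative set."""
    members = {target} | {b for (a, b) in links if a == target}
    return sum(w for (a, b), w in links.items()
               if a in members and b in members)


# Hypothetical network: a target with two associates and no
# associate-to-target links at all.
base = {("planet", "earth"): 0.5, ("planet", "moon"): 0.25}
# The same network plus one associate-to-associate link.
linked = dict(base)
linked[("earth", "moon")] = 0.5

print(spreading_activation("planet", base), spreading_activation("planet", linked))
print(spooky_activation("planet", base), spooky_activation("planet", linked))
```

In this toy network the added associate-to-associate link leaves the spreading score unchanged, because no route leads back to the target, whereas the spooky score increases.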
An extreme kind of contextual interaction occurs when entities acting as contexts for one another influence each other to such an extent that the interaction results in a new entity with properties different from either of its constituents. Their degree of merger may thereafter be such that after the interaction one cannot manipulate one constituent without simultaneously affecting the other. Quantum mechanics provides a means of describing such mergers as a compound of two entities. Two quantum entities can become entangled when they encounter one another, and in this new entangled state they behave as one quantum entity. A state of entanglement can be mathematically described using the tensor product. The tensor product always allows for the emergence of new states (the entangled states) with new properties. Specifically, if H1 is the Hilbert space describing a first sub-entity, and H2 the Hilbert space describing a second sub-entity, then the joint entity is described in the tensor product space H1 ⊗ H2. The formalisms developed to describe quantum phenomena have limitations that make their application specific to quantum mechanics. However, these formalisms have been generalized to apply to other situations exhibiting a similar kind of abstract structure, as discussed shortly. With respect to cognition, we can refer to this need to be able to describe what happens when concepts combine to become a single unit of meaning as an entanglement problem for concepts.
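To make the role of the tensor product concrete, the following sketch represents states of two two-dimensional systems as lists of coefficients and tests whether a joint state factorizes into one state per sub-entity. The helper names and the restriction to two-dimensional spaces are assumptions made for illustration. The test relies on the standard fact that a joint state of two two-dimensional systems is a product state exactly when the determinant of its 2 × 2 coefficient matrix is zero.

```python
import math


def tensor(u, v):
    """Tensor (Kronecker) product of two state vectors given as lists."""
    return [a * b for a in u for b in v]


def is_product_state(psi, tol=1e-12):
    """True if a 4-component joint state factorizes as tensor(u, v).

    psi = [c00, c01, c10, c11]; it factorizes iff c00*c11 - c01*c10 == 0.
    """
    c00, c01, c10, c11 = psi
    return abs(c00 * c11 - c01 * c10) < tol


s = 1 / math.sqrt(2)
up, down = [1.0, 0.0], [0.0, 1.0]

# A product state: each sub-entity can be assigned a state of its own.
product = tensor([s, s], up)

# A superposition of two product states that cannot itself be factorized.
entangled = [s * x + s * y
             for x, y in zip(tensor(up, down), tensor(down, up))]

print(is_product_state(product))    # True
print(is_product_state(entangled))  # False
```

The second state lives in H1 ⊗ H2 but corresponds to no pair of individual states, which is the sense in which the tensor product makes room for new states.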
We now outline a theory of concepts that we believe can tackle the problems presented in the previous sections. With it one can begin to investigate the ecology of concepts: the structure of concepts, how they manifest in the context of other concepts, and how they participate in the world of which they are a part. A similar approach is being used to model contextual effects on word meanings (Bruza & Cole, 2005; Widdows, 2003; Widdows & Peters, 2003) and in decision making (Busemeyer et al. 2006; Busemeyer & Wang, 2007).
We noted that there exists a mathematical framework for describing the change and actualization of potentiality that results from contextual interaction in quantum mechanics, specifically one characterized by extreme susceptibility to change. A limitation of this framework is that it applies to the extreme case, when the response of the entity is maximally contextual. However, the development of generalizations of mathematical structures originally developed for quantum mechanics has provided tools to describe intrinsically contextual situations in not just quantum mechanics but other fields. In other words, these mathematical theories lift the quantum formalism out of the specifics of the microworld, making it possible to describe nondeterministic effects of context in other fields (Aerts, 1993; Aerts & Durt, 1994a, 1994b; Foulis & Randall, 1981; Foulis et al., 1983; Jauch, 1968; Mackey, 1963; Piron, 1976, 1989, 1990; Pitowsky, 1989; Randall & Foulis, 1976, 1978). The original motivation for these generalized formalisms was theoretical (as opposed to the need to describe the reality revealed by experiments) but (as is often the case) they have eventually been found to be useful in the description of real world situations.
The formalism we use is one of these generalizations. It is called the State Context Property (SCOP) formalism (described in detail in Gabora & Aerts, 2002a,b; Aerts & Gabora 2005a,b), an elaboration of the State Property formalism (Beltrametti & Cassinelli, 1981; Aerts, 1982, 1983, 1999, 2002). SCOP allows us to explicitly incorporate the context that evokes a concept and the change of state this induces in the concept into the formal description of a concept. With SCOP it is possible to describe situations with any degree of contextuality. In fact, classical and quantum come out as special cases: quantum at the one end of extreme contextuality and classical at the other end of extreme lack of contextuality (Piron 1976; Aerts 1983). The rationale for applying it to concepts is expressly because it allows incorporation of context into the model of an entity[3].
Using the SCOP formalism, a description of a concept consists of the five elements:
• A set S = {p, q, ...} of states the concept can assume.
• A set M = {e, f, ...} of relevant contexts.
• A set L = {a, b, ...} of relevant properties or features. (Note that contexts can be concepts, as can features.)
• A function n that describes the applicability or weight of a certain feature given a specific state and context. For example, n(p, e, a) is the weight of feature a for the concept in state p under context e.
• A function µ that describes the transition probability from one state to another under the influence of a particular context. For example, µ(f, q, e, p) is the probability that state p under the influence of context e changes to the state q, giving rise to the new context f.
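As a rough illustration of how these five elements fit together, the sketch below stores them in a small Python container. The class name, the state, context, and property labels, and every number are hypothetical and chosen only for illustration; none of them comes from the SCOP papers. The final check also assumes that, for a given state and context, the transition probabilities over all outcomes sum to one.

```python
from dataclasses import dataclass, field


@dataclass
class Scop:
    """Hypothetical container for the five SCOP elements of one concept."""
    states: set        # S
    contexts: set      # M
    properties: set    # L
    weight: dict = field(default_factory=dict)      # n: (p, e, a) -> weight
    transition: dict = field(default_factory=dict)  # mu: (f, q, e, p) -> probability

    def n(self, p, e, a):
        """Weight of property a for the concept in state p under context e."""
        return self.weight.get((p, e, a), 0.0)

    def mu(self, f, q, e, p):
        """Probability that state p under context e changes to state q,
        giving rise to the new context f."""
        return self.transition.get((f, q, e, p), 0.0)

    def outgoing_probability(self, e, p):
        """Sum of mu over all recorded outcomes (f, q) for state p, context e."""
        return sum(w for (f, q, e2, p2), w in self.transition.items()
                   if e2 == e and p2 == p)


# All labels and numbers below are invented for illustration.
screwdriver = Scop(
    states={"unspecified", "as tool", "as weapon"},
    contexts={"repair", "threat"},
    properties={"turns screws", "dangerous"},
)
screwdriver.weight[("as tool", "repair", "turns screws")] = 0.875
screwdriver.weight[("as weapon", "threat", "dangerous")] = 0.75
screwdriver.transition[("repair", "as tool", "repair", "unspecified")] = 0.75
screwdriver.transition[("repair", "as weapon", "repair", "unspecified")] = 0.25

print(screwdriver.n("as tool", "repair", "turns screws"))
print(screwdriver.mu("repair", "as tool", "repair", "unspecified"))
print(screwdriver.outgoing_probability("repair", "unspecified"))
```

Dictionaries keyed by tuples keep the sketch close to the notation n(p, e, a) and µ(f, q, e, p); combinations that have not been recorded default to zero, reflecting that any such model includes only a chosen subset of states and contexts.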
An astute reader will recall that the entire set of relevant states and contexts is unlimited. Clearly it is not possible to incorporate all of them in the model, and moreover some degree of subjectivity is inevitable in the choice of relevant states and contexts. The model is an idealization of the concept. However, the more states and contexts included in the model, the richer it becomes[4]. What is important is that the potential to include this richness is present in the formalism; i.e. there is a place in it to include even improbable states, and largely but not completely irrelevant contexts. As a SCOP model grows to incorporate more and more concepts, the sets of states and contexts included in the model of any one particular concept will grow accordingly. Each concept (or constellation of concepts) can be considered a context (however unlikely)