Clancey, W.J. (1993). Notes on "Heuristic Classification". Artificial Intelligence, 59(1-2): 191-196. Special issue: "Artificial Intelligence in Perspective".

Notes on

"Heuristic Classification"

William J. Clancey
Institute for Research on Learning
2550 Hanover Street
Palo Alto, CA 94304
 
 

Clancey, W.J. 1985. Heuristic Classification. Artificial Intelligence, 27(3): 289-350.

This paper began as a memo written at Teknowledge, Inc. in late 1983, stemming from discussions with Steve Hardy and Denny Brown about how to teach knowledge engineering. We based our courses on a library of "sample knowledge systems" and looked for patterns that could be taught as design principles. Discussion raged about competing knowledge representations: rule vs. frame languages, deep vs. shallow systems, classification vs. causal reasoning, first principles vs. case-based modeling. Customers and students at Teknowledge pressed us to relate our products and terminology to our competitors' (what marketing people call "tool comparison"). Hardy and Brown wanted to relate our example systems to the "representation, inference, and control" framework, which they preferred for describing reasoning. I wanted to convince the developers of Teknowledge's S.1 and M.1 that a representation language should incorporate classification primitives.

Analyzing expert systems and articulating patterns was part of my continuing effort (with Jim Bennett) to encourage Teknowledge's programmers to develop task-specific tools (Bennett, 1985). We realized by 1982 that the Neomycin approach of abstracting the diagnostic inference procedure from the domain model produced a reusable shell that included not just an inference engine, but a task-specific representation language and reasoning procedure (the diagnostic metarules). The idea of reasoning patterns was therefore in the air and suggested commonalities that recurred across domains. Notably, Newell's "Knowledge-Level" AAAI Presidential Address at Stanford in August 1980 (Newell, 1982) impressed upon us the importance of levels of description, and the need to move our descriptions from the implementation (rules vs. frames) to the conceptual level. I already had some experience in such cross-architecture comparisons (e.g., Section 7 of the Epistemology paper applied the Structure-Strategy-Support framework to six knowledge-based systems ranging from AM to Hearsay).

A few clarifications

The strength of the heuristic classification article may be its generality, for it offers something to everyone. But there have been a few important misinterpretations:

1) I strongly urged people not to view classification as an inherent property of problems. Classification is a method for constructing a situation-specific model; a given modeling purpose, such as diagnosis, might be accomplished in different ways, depending on the kind of model available. Problem types can be classified more usefully in terms of the purpose for constructing a model of some system in the world (i.e., the tasks of diagnosis, planning, control, repair, etc.).

2) I distinguished between analytic and synthetic tasks, which I perhaps unfortunately labeled "interpret/analysis" and "construct/synthesis" (Figures 5-1 and 5-2). A few readers confused these terms, which refer to analyzing an existing system in the world (e.g., predicting or explaining its behavior) or constructing a new system in the world (i.e., building it or repairing it), with how the situation-specific model is inferred. The point of the article, of course, was to contrast inference by selection of models from a pre-enumerated classification with construction of models from structural and functional components related spatially, temporally, and causally. The typology of problem tasks refers to why the system is being modeled; the typology of inference methods refers to how the model is developed (a sketch of this pairing follows).
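To make the why/how distinction concrete, here is a minimal sketch (the enum names and pairings are my own illustration, not notation from the article) relating modeling purposes to inference methods:

    from enum import Enum

    class Purpose(Enum):
        """Why the system in the world is being modeled (the task typology)."""
        DIAGNOSE = "diagnose"
        PLAN = "plan"
        CONTROL = "control"
        REPAIR = "repair"

    class Method(Enum):
        """How the situation-specific model is inferred (the method typology)."""
        HEURISTIC_CLASSIFICATION = "select from a pre-enumerated classification"
        MODEL_CONSTRUCTION = "assemble from structural/functional primitives"

    # The same purpose can be served by either method, depending on the
    # kind of general model available.
    examples = [
        (Purpose.DIAGNOSE, Method.HEURISTIC_CLASSIFICATION),  # e.g., Neomycin
        (Purpose.DIAGNOSE, Method.MODEL_CONSTRUCTION),        # e.g., Abel
    ]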

When writing the article, I was unsure how to contrast the process of constructing a line of reasoning in heuristic classification with the process of constructing a new model. Section 6.2.4 is one attempt, based on pre-enumerated versus new links between concepts. In "Model construction operators" (MCO) (Clancey, in press), I emphasize that construction of situation-specific model graphs is always occurring in expert systems; the difference between Neomycin and Abel, for example, lies in what the nodes and links represent. For Neomycin, using heuristic classification, each node is a process description of the entire system being modeled (e.g., "there is a bacterial agent growing in the meninges of the central nervous system"). In Abel, which constructs a situation-specific model from primitives, the nodes represent physiological substances and processes within the system being modeled.
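The contrast in what the nodes denote might be sketched as follows (the class names are hypothetical, and both programs of course used far richer machinery; this is a sketch of the distinction, not of either implementation):

    from dataclasses import dataclass, field

    @dataclass
    class ClassificationNode:
        # Heuristic classification (Neomycin-style): each node is a process
        # description of the entire system being modeled, drawn from a
        # pre-enumerated taxonomy of such descriptions.
        description: str  # e.g., "bacterial agent growing in the meninges"
        subtypes: list["ClassificationNode"] = field(default_factory=list)

    @dataclass
    class ComponentNode:
        # Model construction (Abel-style): each node denotes a substance or
        # process within the system being modeled; causal links among such
        # nodes are assembled into the situation-specific model.
        entity: str  # e.g., "serum bicarbonate concentration"
        causes: list["ComponentNode"] = field(default_factory=list)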

Redescribing by revisualizing

By the time I wrote MCO, I realized that many confusions about representations could be resolved if we see them as alternative perspectives on a single "virtual" formal system. In MCO, I relate node, path, subgraph, and graph views of inference (Section 3). This theme runs throughout my work, as I realized that representations and reasoning processes commonly viewed as different could be related by a shift in visualization:

1) Neomycin's causal-network-to-classification inference appeared earlier in Casnet (I missed this at first because I drew horizontally what Weiss and Kulikowski drew vertically);

2) Mycin's context tree constitutes a three-paneled blackboard (we missed this because we drew as a tree what others drew as layered boxes; we emphasized inheritance, they emphasized levels of description);

3) Neomycin's differential (a list of diseases) can be better represented as an explanation-proof tree, which we call the situation-specific model (we missed this because we thought Abel was doing "deep" causal modeling while Neomycin was only doing "shallow" classification, that is, not modeling at all).

Many other examples appear in the figures of MCO (e.g., Figures 17, 19, 27, and 34, and Table 5). Through this experience I developed the intuition that seemingly intractable debates about representations often stem from different visualizations or metaphors, and hence apparently incommensurable languages, not from inherent differences in the modeling methods of the programs being described. Bill Scherlis first impressed me with this possibility in 1979, when he argued that Mycin's rules could be expressed in predicate calculus, an idea that seemed sacrilegious at the time.

Revisualizing is reconceiving, not deriving, mapping, or compiling

Even when I realized that there were multiple perspectives for describing representations, I thought they must be derivable from each other, such as by a compilation process. For example, I had been asked by Keith Butler (at Boeing in 1984) to explain how common representational distinctions such as class/individual, type/subtype, and definition/schema relate to the heuristic classification framework. I found that these representational primitives play different roles: Definitions tend to be used for initial abstraction of data, schemas are used for heuristic association, and subtypes are used for both abstraction and refinement. Thus, we move from a concept or terminology-centered description of the knowledge base to a line-of-reasoning, data-to-solution perspective. When writing the article, I conceived of this analysis as "deriving" the horseshoe diagram from primitive relations (Section 4.4). But it is better to say that I shifted perspective from static concepts in isolation to how information about one concept is inferred from another (revealing an ordering of relations characterized as heuristic classification).
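To illustrate the shift (a minimal reconstruction of my own, with an invented toy medical fragment, not code or data from the article): once the relations are ordered data-to-solution, the roles of the primitives can be read off directly.

    # The heuristic classification "horseshoe" as relations followed in
    # data-to-solution order: definitions abstract raw data, a heuristic
    # association crosses over to the solution taxonomy, and subtype links
    # refine the solution. The relation contents are illustrative only.

    definitions = {"WBC > 10,000": "leukocytosis"}    # data abstraction
    heuristics = {"leukocytosis": "infection"}        # heuristic association
    subtypes = {"infection": ["bacterial infection",  # refinement
                              "viral infection"]}

    def classify(datum: str) -> list[str]:
        # Follow the relations in order: abstract, associate, refine.
        abstraction = definitions[datum]
        solution_class = heuristics[abstraction]
        return subtypes[solution_class]

    print(classify("WBC > 10,000"))  # ['bacterial infection', 'viral infection']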

As I relate in my comments on the Epistemology paper (Clancey, this volume), attempting to generate patterns by reducing them to primitive representations is a powerful computational approach for constructing new and more generally useful process models. But it is also a scientific presumption about the nature of models and levels of description that is perhaps hampering our attempts to understand how people create and use representations (cf. Clancey, in preparation; Lave, 1988). In particular, a common assumption is that there is always one "correct" description of a system and alternative representations must be logically inferable from each other (cf. Schön's (1979) critique of analogical reasoning as structure mapping). This rests on the more general assumption that reality can be exhaustively modeled, with the idea that scientific laws (the "hidden truth of the matter") literally generate all the phenomena we observe. This parallels the prevalent belief of cognitive scientists that all human behavior is generated from a representational bedrock; in particular, tacit knowledge ("know how") must be compiled from representations. (Newell explicated this in his "brain as an orange" model, in which the "core" is knowledge compiled from a "rind" of production rules.) In effect, conflating knowledge, reality, and representations shaped the dilemmas of representational theory over the past few decades (Clancey, 1991a; b).

Lessons and impact

One idea in this article (Section 5) that I believe could be developed further is to integrate knowledge engineering with systems analysis. In MCO, I extend this argument to claim that AI programming should be conceived as a process-modeling technique, emphasizing qualitative or relational representations, as opposed to quantitative or numeric representations. Integrating these approaches to serve the needs of scientific and engineering modeling was at first obscured by the original emphasis that an expert system is necessarily related to how experts reason, by virtue of the knowledge "acquisition" process by which a human "transfers expertise" to the program. For example, even though we weren't interested in strictly modeling human reasoning, we were biased against using numeric models in expert systems because we believed classification and rule-based inference to be a better model of human knowledge than equation manipulation.

Emboldened by the publication of the Winograd and Flores (1986) book, I first presented these ideas at the Oregon State Knowledge Compilation and Banff Knowledge Acquisition Workshops in October 1986. I argued that it is more fruitful and appropriate to characterize a knowledge base as a model of some system in the world coupled with a task-specific reasoning procedure (Clancey, 1989a), and not to equate a representation of knowledge (the knowledge base) with knowledge, a capacity to behave (Clancey, 1991b). We should use whatever modeling techniques are useful for the problems at hand (Clancey, 1989b). Furthermore, we should recognize that the qualitative modeling techniques of AI programming have a generality and value that extend beyond their initial development for representing human beliefs and reasoning (MCO). Framing AI research methods in this way helps us understand why numeric representations (for example, certainty factors) seem to violate the rules of the game; this view is also important for not dismissing the value of schema models as situated cognition calls assumptions about knowledge representation into question (Clancey, 1991b).

By focusing on graph manipulation operators in MCO, I aim to squarely place knowledge engineering in the realm of computer programming and operations research. Today we are less prone to confuse means (building on people's existing language and models) with goals (constructing models in order to facilitate scientific prediction and experimentation, as well as the design and maintenance of complex engineering and organizational systems). Significantly, the "information for authors" of the Knowledge Acquisition journal now says, "The emphasis is not on artificial intelligence, but on the extension of natural intelligence through knowledge-based systems."

The heuristic classification paper helped move arguments about representations from the level of programming constructs (e.g., rules vs. frames) and conceptual networks (e.g., terminology classifications) to the level of recurrent abstractions in process modeling (e.g., kinds of taxonomies, how modeling tasks chain together). For example, the idea that causal inferences can feed into a classification, pioneered in Casnet and rediscovered in Neomycin, is now a commonplace modeling technique that can be taught explicitly to knowledge engineers and used to structure knowledge acquisition tools. Other researchers have gone beyond my promissory notes to deliver a second generation of process modeling languages and tools (Alexander, et al., 1986; Breuker and Wielinga, 1985; Chandrasekaran, 1986; Gruber, 1989; Hayes-Roth, et al., 1988; McDermott, 1988; Musen, 1989; Steels, 1985; Stefik, in preparation). In many respects, the original hope behind my conversations with Steve Hardy and Denny Brown has been realized.

References

Alexander, J. H., Freiling, M. J., Shulman, S. J., Staley, J. L., Rehfuss, S., & Messick, M. 1986. Knowledge level engineering: Ontological analysis. Proceedings of the National Conference on Artificial Intelligence, pp. 963-968.

Bennett, J. S. 1985. ROGET: A knowledge-based consultant for acquiring the conceptual structure of an expert system. Journal of Automated Reasoning, 1: 49-74.

Breuker, J. and Wielinga, B. 1985. KADS: Structured knowledge acquisition for expert systems. In Second International Workshop on Expert Systems. Avignon.

Chandrasekaran, B. 1986. Generic tasks in knowledge-based reasoning: High-level building blocks for expert system design. IEEE Expert 1(3):23-29.

Clancey, W. J. 1989a. Viewing knowledge bases as qualitative models. IEEE Expert, (Summer 1989), 9-23.

Clancey, W. J. 1989b. The knowledge level reinterpreted: Modeling how systems interact. Machine Learning 4(3/4): 287-293.

Clancey, W.J. 1991a. The frame of reference problem in the design of intelligent machines. In K. VanLehn (ed), Architectures for Intelligence: The Twenty-Second Carnegie Symposium on Cognition, Hillsdale: Lawrence Erlbaum Associates, pp. 357-424.

Clancey, W. J. 1991b. Situated Cognition: Stepping out of Representational Flatland. AI Communications—The European Journal on Artificial Intelligence 4(2/3):109-112.

Clancey, W.J. (In press). Model construction operators. To appear in Artificial Intelligence.

Clancey, W.J. (this volume). Comments on "Epistemology of a rule-based expert system."

Gruber, T. 1989. Automated knowledge acquisition for strategic knowledge. Machine Learning 4(3/4):293-336.

Hayes-Roth, B., M. Hewitt, M. Vaughn Johnson, and A. Garvey. 1988. ACCORD: A framework for a class of design tasks. KSL Technical Report 88-19, Computer Science Department, Stanford University.

Lave, J. 1988. Cognition in Practice. Cambridge: Cambridge University Press.

McDermott, J. 1988. Preliminary steps toward a taxonomy of problem-solving methods. In S. Marcus (ed), Automating Knowledge Acquisition for Expert Systems, Boston: Kluwer Academic Publishers, pp. 225-256.

Musen, M. A. 1989. Automated support for building and extending expert models. Machine Learning 4(3/4), 347-375.

Newell, A. 1982. The knowledge level. Artificial Intelligence, 18(1):87-127, January.

Schön, D.A. 1979. Generative metaphor: A perspective on problem-setting in social policy. In A. Ortony (ed), Metaphor and Thought. Cambridge: Cambridge University Press. pp. 254-283.

Steels, L. 1985. Second generation expert systems. Future Generation Computer Systems, 1(4):213-237.

Stefik, M. (in preparation). Introduction to Knowledge Systems. Morgan-Kaufmann.

Winograd, T. and Flores, F. 1986. Understanding Computers and Cognition: A New Foundation for Design. Norwood: Ablex.