Support for Ontological Diversity and Evolution

Most people now understand the need for biodiversity, and many understand the need for cultural diversity, but it seems harder for ontologists and knowledge engineers to accept the need for conceptual diversity. Many of them seem to believe in the possibility of a single unified ontology that attracts consensus because it "reflects the real underlying reality" of a domain. Yet there are numerous examples where it seems rather clear that consensus is unlikely to occur. Not least of these is the most classical, the systematic classification of plants and animals, which in Western culture goes back at least to Aristotle. Foundations of mathematics is another domain where consensus seems unattainable, with the long-running battles between mainline set theorists and the several varieties of intuitionists showing no signs of being resolved. Evolution is an endemic and unending fact of life for classification systems that are in active use, especially in fast-moving fields such as biology and information technology.

If the need for conceptual diversity is accepted, it then follows that knowledge engineering should seek ways to support it, rather than ways to overcome, suppress, or subvert it. I suggest that a change in terminology may help us to remember this point: instead of always speaking of "ontologies," a term that has a realist, Platonist connotation, let us sometimes speak of "theories," a term that has pragmatic, provisional, and evolutionary connotations, recognizing the contextual dependence of terminology. Ontologies written in most recent formal ontology languages are in fact theories in the technical sense of formal logic, i.e., they are sets of declarations and axioms in a precise logical system; these languages include OWL (in all its variants) and all description logics, since these have been given precise semantics as formal logics in their specification documents.

There are many practical implications of this view: we should provide support for multiple evolving ontologies for single domains, accept that translations among such theories will necessarily be partial and incomplete, and provide tools to help construct such partial mappings. Such tools can also be used to identify subdomains where consensus is most likely to be achievable (these are areas where translation mappings are best defined, i.e., least partial and incomplete), and to construct provisional supertheories that support discourse on areas of disagreement where useful new ideas might emerge (these are areas that have a rich mixture of implicit commonality and explicit disagreement).
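One way to make "least partial and incomplete" concrete is to measure, per subdomain, what fraction of concepts a partial mapping actually translates. The following is only a minimal sketch under strong assumptions: a mapping is reduced to a dictionary from source concept names to target concept names, and the subdomain names, concept names, and the functions `coverage` and `rank_subdomains` are all hypothetical illustrations, not part of any tool mentioned here.

```python
from typing import Dict, List, Set, Tuple


def coverage(concepts: Set[str], mapping: Dict[str, str]) -> float:
    """Fraction of the given concepts that the partial mapping translates."""
    if not concepts:
        return 1.0  # an empty subdomain is trivially fully covered
    return sum(1 for c in concepts if c in mapping) / len(concepts)


def rank_subdomains(
    subdomains: Dict[str, Set[str]], mapping: Dict[str, str]
) -> List[Tuple[str, float]]:
    """Order subdomains from best-covered to least-covered by the mapping."""
    scored = [(name, coverage(cs, mapping)) for name, cs in subdomains.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: two subdomains of one classification, and a partial
# mapping into another classification (the concept "Protozoan" has no image).
subdomains = {
    "vertebrates": {"Mammal", "Bird", "Fish"},
    "protists": {"Alga", "Protozoan"},
}
mapping = {"Mammal": "Mammalia", "Bird": "Aves", "Fish": "Pisces", "Alga": "Algae"}

ranking = rank_subdomains(subdomains, mapping)
```

Under these assumptions, the subdomains at the top of the ranking are candidates for consensus, while those with intermediate coverage may be the ones worth a provisional supertheory. A real tool would of course need a richer notion of mapping quality than simple name coverage.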

Another important practical implication is that it is unlikely that ontologies will ever be able to provide all the background information needed to resolve data translation and integration problems. Difficulties here include not just multiple inconsistent ontologies, but also the vast array of factors that can impact choices of translation and integration functions, as well as the potentially very complex interactions among these factors. In fact, it seems that there is no feasible way even to enumerate all the potentially relevant factors and interactions, let alone to describe how they should all be handled in all possible situations.

Passing now to a more practical level, we can describe aspects of an architecture to support conceptual diversity and evolution with a relatively simple model, consisting (for a fixed, relatively stable domain) of a core, or kernel, theory C of commonly acknowledged concepts, and some given set of extensions of C, ei: C -> Ci. Note that these maps do not need to be inclusions: they might change names, or define new concepts in terms of core concepts, possibly even by explicitly denying the validity of certain concepts in C [in logic, these are called "theory interpretations"; the most general notion of such an extension is that of a so-called co-relation, which consists of a "mediating" theory Mi with inclusions ji: C -> Mi and j'i: Ci -> Mi]. We can then consider the possibility of further extensions of the Ci, representing fragmentation at a finer level of detail within particular subcommunities. There could also be more than one map between two given theories.
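The simplest case of this model, a core theory with an extension that merely renames concepts, can be sketched in a few lines. This is a drastic simplification offered only for illustration: axioms are reduced to tuples of a relation keyword followed by symbols, "provable in the target" is approximated by "literally present among the target's axioms", and the classes `Theory` and `TheoryMap` and all symbol names are hypothetical. It does not cover maps that define new concepts or deny core concepts, nor the co-relations with a mediating theory.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

# An axiom is a relation keyword followed by the symbols it relates,
# e.g. ("subclass", "Dog", "Animal"). Relation keywords are never renamed.
Axiom = Tuple[str, ...]


@dataclass
class Theory:
    name: str
    symbols: FrozenSet[str]
    axioms: FrozenSet[Axiom]


@dataclass
class TheoryMap:
    """A symbol renaming from a source theory to a target theory."""
    source: Theory
    target: Theory
    rename: Dict[str, str]

    def translate(self, axiom: Axiom) -> Axiom:
        relation, *args = axiom
        return (relation, *(self.rename[a] for a in args))

    def is_well_formed(self) -> bool:
        # Every source symbol has an image, and every image is a target symbol.
        return (set(self.rename) == set(self.source.symbols)
                and set(self.rename.values()) <= set(self.target.symbols))

    def preserves_axioms(self) -> bool:
        # Simplified check: each translated axiom appears verbatim in the target.
        return all(self.translate(ax) in self.target.axioms
                   for ax in self.source.axioms)


# Hypothetical core C and one extension C1 that renames and adds concepts.
C = Theory("C", frozenset({"Animal", "Dog"}),
           frozenset({("subclass", "Dog", "Animal")}))
C1 = Theory("C1", frozenset({"Animalia", "Canis", "Wolf"}),
            frozenset({("subclass", "Canis", "Animalia"),
                       ("subclass", "Wolf", "Canis")}))
e1 = TheoryMap(C, C1, {"Animal": "Animalia", "Dog": "Canis"})
```

Here `e1` is not an inclusion, since it changes both names, yet it carries the single core axiom to an axiom of C1. Several distinct `TheoryMap` values between the same two theories are possible in this representation, matching the remark that there may be more than one map between two given theories.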

Finally, we can support the evolution of such diagrams of theories with a CVS system that includes mappings from old to new versions, where such mappings are morphisms of diagrams, composed of consistent sets of mappings of the component theories. We have developed a protocol (called the tatami protocol) to support a distributed CVS environment for such systems of theories. Maps of theories are useful for translating and integrating data and concepts, in ways that are already familiar in the database community (e.g., see the so-called "model management" for schemas as promoted by Phil Bernstein and others); additional applications to the process of doing science were suggested above. Since the theories are likely to be evolving, if the maps are to be useful in these ways, they also should evolve, in which case one wants to update the maps, not start from scratch, and one wants the right version of the maps for one's particular situation. Hence a version and configuration manager and an incremental map editor will be very useful tools in practice.
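The idea of updating a map rather than starting from scratch can be illustrated by composing an existing map with a version map. The sketch below is not the tatami protocol and makes strong assumptions: maps are plain dictionaries of symbol names, the version map from the old to the new version of a theory may be partial, and the function `compose` and all names in the example are hypothetical.

```python
from typing import Dict, Set, Tuple


def compose(
    first: Dict[str, str], second: Dict[str, str]
) -> Tuple[Dict[str, str], Set[str]]:
    """Apply `first`, then `second`.

    Returns the composed map together with the set of symbols whose image
    under `first` has no image under `second`; these are the entries an
    incremental map editor would need to present for manual attention.
    """
    composed: Dict[str, str] = {}
    orphaned: Set[str] = set()
    for symbol, image in first.items():
        if image in second:
            composed[symbol] = second[image]
        else:
            orphaned.add(symbol)
    return composed, orphaned


# Hypothetical existing extension map from the core into version 1 of a theory.
e1 = {"Animal": "Animalia", "Dog": "Canis"}
# Hypothetical version map from version 1 to version 2; "Canis" was reorganized
# in version 2 and has no direct successor recorded here.
v = {"Animalia": "Metazoa"}

e1_updated, needs_review = compose(e1, v)
```

In this example the entry for "Animal" is carried forward automatically to "Metazoa", and only "Dog" is flagged for review, which is the sense in which the map is updated incrementally.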

Our group is excited that the essential theory of such theory morphisms, of diagrams of theories and morphisms, and of morphisms of such diagrams is now well worked out and published, and that many essential aspects of a suitable mapping tool (namely, SCIA) are also now well understood and implemented.

by Joseph Goguen.
Last modified: Tue Nov 8 10:46:48 PST 2005