Significant Upgrades, Changes Based on Two Years of Use
Structured Dynamics and Ontotext are pleased to announce the latest release of UMBEL, version 0.80. It has been more than a year since the last update of UMBEL, and well past earlier announced targets for this upgrade. UMBEL was first publicly released as version 0.70 on July 16, 2008.
UMBEL (Upper Mapping and Binding Exchange Layer) has two roles. It is firstly a vocabulary for building reference ontologies to guide the interoperation of domain information. It is secondly a reference ontology in its own right that contains about 21,000 general reference concepts. With more than two years of practical experience with UMBEL, much has been learned.
This learning is now reflected in five major changes to the system, which together embody numerous minor changes. I summarize these major changes below. The formal release of UMBEL v. 0.80 is also accompanied by a complete revamping and updating of the project’s Web site. I hope you will find these changes as compelling and exciting as we do.
In the broader context, it is probably best to view this release as but the interim first step of a two-step release sequence leading to UMBEL version 1.00. We are on track to release version 1.00 by the end of this year. This second step will include a complete mapping to the PROTON upper-level ontology and the re-organization and categorization of Wikipedia content into the UMBEL structure. We anticipate the pragmatic challenges in this massive effort will also inform some further refinements to UMBEL itself, which will also lead to further changes in its specification.
Nonetheless, UMBEL v. 0.80 does embody most of the language and structural changes anticipated over this evolution. It is fully ready for use and evaluation; it will, for example, be incorporated into a next version of FactForge. But do be aware that the major revisions discussed herein are subject to further refinement as the efforts leading to version 1.00 conclude over the next few weeks.
Let’s now overview these major changes in UMBEL v. 0.80.
Major Change #1: Clarification of Dual Role
The genesis of UMBEL more than three years ago was the recognition that data interoperability on the semantic Web depended on shared reference concepts to link related content. We spent much effort to construct such a reference structure with about 21,000 concepts. That purpose remains.
But the way in which we created this structure (its vocabulary) has also proven to have value in its own right. We have now applied the same basic approach used to construct the original UMBEL to multiple, specific domain ontologies. With use, it has become clear that the vocabulary for creating reference ontologies is on an equal footing with the reference concepts themselves.
With this understanding has come clarity of role and description of UMBEL. With version 0.80, we now have explicitly split and defined these roles and files.
The UMBEL Vocabulary
Thus, UMBEL’s first purpose is to provide a general vocabulary (the UMBEL Vocabulary) of classes and predicates for describing domain ontologies, with the specific aim of promoting interoperability with external datasets and domains. It can be used independently of the UMBEL Reference Concept Ontology.
The UMBEL Vocabulary recognizes that different sources of information have different contexts and different structures. A meaningful vocabulary is necessary that can express potential relationships between two information sources with respect to their differences in structure and scope. By nature, these connections are not always exact. Means for expressing the “approximateness” of relationships are essential.
The vocabulary has been greatly simplified from earlier versions (see Major Change #2 below); it now defines two classes:
- RefConcept
- SuperType
These are explained further below. And, the vocabulary has 10 properties:
- isAbout
- isRelatedTo
- isLike
- hasMapping
- hasCharacteristic
- isCharacteristicOf
- prefLabel
- altLabel
- hiddenLabel
- definition.
(Note, the latter four are also in SKOS; see [1].)
In addition, UMBEL re-uses certain properties from external vocabularies. These classes and properties are used to instantiate the UMBEL Reference Concept ontology (see next), and to link Reference Concepts to external ontology classes. For more detail on the vocabulary see Part I: Vocabulary Specification in the specifications.
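As a rough illustration of how small the vocabulary now is, the sketch below holds the two classes and ten properties listed above as plain Python sets and flags any `umbel:`-prefixed term that is not among them. The `umbel:` and `ex:` prefixes, the tuple representation of triples, and the example statements are assumptions made for this sketch; they are not taken from the specification.

```python
# Names come from the lists above; everything else here is illustrative.
UMBEL_CLASSES = {"RefConcept", "SuperType"}
UMBEL_PROPERTIES = {
    "isAbout", "isRelatedTo", "isLike", "hasMapping",
    "hasCharacteristic", "isCharacteristicOf",
    "prefLabel", "altLabel", "hiddenLabel", "definition",
}


def unknown_umbel_terms(triples):
    """Return every umbel:-prefixed term that is not in the vocabulary."""
    known = UMBEL_CLASSES | UMBEL_PROPERTIES
    prefix = "umbel:"
    unknown = set()
    for triple in triples:
        for term in triple:
            if term.startswith(prefix) and term[len(prefix):] not in known:
                unknown.add(term)
    return unknown


# Hypothetical statements about a hypothetical concept, ex:Truck.
triples = [
    ("ex:Truck", "rdf:type", "umbel:RefConcept"),
    ("ex:Truck", "umbel:prefLabel", "truck"),
    ("ex:Truck", "umbel:altLabel", "lorry"),
    ("ex:Truck", "umbel:preLabel", "truck"),  # deliberate misspelling
]

print(unknown_umbel_terms(triples))
```

A check like this only catches misspelled or retired term names; it says nothing about whether a property is applied to the right kind of subject or object.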
The UMBEL Reference Concept Ontology
The second purpose of UMBEL is to provide a coherent framework of broad subjects and topics, the “reference concepts” or RefConcepts, expressed as the UMBEL Reference Concept Ontology. The RefConcepts act as binding nodes for mapping relevant Web-accessible content, again with the specific aims of promoting interoperability and of reasoning over a coherent reference structure and its linked resources. UMBEL presently has about 21,000 of these reference concepts drawn from the Cyc knowledge base, which are organized into more than 30 mostly disjoint SuperTypes (see Major Change #3).
The UMBEL Reference Concept Ontology is, in essence, a content graph of subject nodes related to one another via broader-than and narrower-than relations. In turn, these internal UMBEL RefConcepts may be related to external classes and individuals (instances and named entities) via a set of relational, equivalent, or alignment predicates (the UMBEL Vocabulary, see above).
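To make the “content graph” idea concrete, here is a minimal sketch that stores broader-than links as pairs and walks them transitively in either direction. The concept names and the pair-based storage are assumptions chosen for the example; the real ontology holds about 21,000 concepts and is expressed in OWL, not in Python structures.

```python
from collections import defaultdict

# Hypothetical concepts; each pair reads (narrower, broader).
BROADER_PAIRS = [
    ("Ape", "Mammal"),
    ("Mammal", "Animal"),
    ("Eagle", "Bird"),
    ("Bird", "Animal"),
]


def build_indexes(pairs):
    """Build lookup tables for both directions of the hierarchy."""
    broader = defaultdict(set)
    narrower = defaultdict(set)
    for child, parent in pairs:
        broader[child].add(parent)
        narrower[parent].add(child)
    return broader, narrower


def closure(start, index):
    """Collect every node reachable from start by following index edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in index.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


broader, narrower = build_indexes(BROADER_PAIRS)
print(sorted(closure("Ape", broader)))      # everything broader than Ape
print(sorted(closure("Animal", narrower)))  # everything narrower than Animal
```

Keeping both indexes makes it cheap to ask either question of the same graph: which concepts subsume a given node, and which nodes fall under a given concept.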
The actual RefConcepts are the least changed part of UMBEL and retain the same identifiers as in prior versions. The Reference Concept Ontology now uses a recently updated release of the OpenCyc KB v3. Cycorp also added some mapping predicates in this release that allow items such as fields of study to be added to the structure. (Thanks, Cycorp!)
[Figure: a large-graph view of the 21,000 reference concepts in the ontology.]
More detail on the RefConcepts is provided in Part II: Reference Concepts Specification of the full specifications.
Major Change #2: Reference Concepts and Predicate Simplification
The nine dimensions and their SuperTypes (discussed under Major Change #3):
Dimension | SuperTypes
Natural World | Natural Phenomena; Natural Substances; Earthscape; Extraterrestrial
Living Things | Prokaryotes; Protists & Fungus; Plants; Animals; Diseases; Person Types
Human Activities | Organizations; Finance & Economy; Society; Activities
Time-related | Events; Time
Human Works | Products; Food or Drink; Drugs; Facilities
Human Places | Geopolitical; Workplaces, etc.
Information | Chemistry (n.o.c); Audio Info; Visual Info; Written Info; Structured Info; Notations & References; Numbers
Descriptive | Attributes
Classificatory | Abstract-level; Topics/Categories; Markets & Industries
Another set of major changes was the simplification and streamlining of the predicates and construction of the UMBEL Vocabulary [2]. Again, the specifications detail these changes, but the significant ones include:
- Changed the name of ‘Subject Concepts’ (SubjectConcept, or SC) to ‘Reference Concepts’ (RefConcept, or RC). The umbel:SubjectConcept class is now deprecated and the umbel:RefConcept class has been added. Many practitioners questioned the rather tortured earlier use of “subject concepts”; the new name reflects the actual reference use of the concepts and of the ontologies that employ them
- Dropped the “SemSet” class and replaced it with the best practice of providing multiple tagging options through umbel:prefLabel together with multiple umbel:altLabels and umbel:hiddenLabels. This simplifies the language and brings usage into conformance with standard practice and reasoners
- With the addition of SuperTypes (see next Major Change), dropped the distinction for “abstract concepts” and rolled their earlier use into the standard RefConcepts
- The simplification due to OWL 2 metamodeling (see Major Change #4) enabled the removal of many earlier predicates and their inverse properties,
- With experience gained through linking datasets and their attributes to ontologies [3], added predicates (hasCharacteristic and isCharacteristicOf) for relating external properties, and
- Many other streamlining changes and improvements to property specifications.
See Part II of the full specifications for further details.
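The label-based replacement for SemSets can be pictured with a small lookup sketch: one preferred label, any number of alternative labels, and hidden labels that catch variants such as misspellings. The concept identifier, the labels, and the word-by-word matching are assumptions for illustration only; this is not how any UMBEL tagger is actually implemented.

```python
# Hypothetical labels for one hypothetical concept.
LABELS = {
    "ex:Truck": {
        "prefLabel": "truck",
        "altLabel": ["lorry"],
        "hiddenLabel": ["truk"],  # a common misspelling, used for matching only
    },
}


def build_label_index(labels):
    """Map each lower-cased label to the set of concepts that carry it."""
    index = {}
    for concept, entry in labels.items():
        names = [entry["prefLabel"]]
        names += entry.get("altLabel", [])
        names += entry.get("hiddenLabel", [])
        for name in names:
            index.setdefault(name.lower(), set()).add(concept)
    return index


def tag(text, index):
    """Return the concepts whose labels appear as whole words in text."""
    found = set()
    for word in text.lower().split():
        found |= index.get(word, set())
    return found


index = build_label_index(LABELS)
print(tag("The lorry passed a truk stop", index))
```

This naive matcher handles single-word labels only; multi-word labels would need phrase matching, which is left out to keep the sketch short.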
Major Change #3: SuperTypes
Shortly after the first public release of UMBEL, it was apparent that the 21,000 reference concepts tended to “cluster” into some natural groupings. Further, upon closer investigation, it was also apparent that most of these concepts were disjoint with one another. As subsequent analysis showed, more fully detailed in the Annex G document, fully 75% of the reference concepts in the UMBEL ontology are disjoint with one another.
Natural clusters provide a tractable way to access and manage some 21,000 items. A large degree of disjointness between concepts can also lead to reasoning benefits and to faster processing and selection of those items.
For these reasons we undertook a dedicated analysis to assign all UMBEL reference concepts to a new class of SuperTypes. SuperTypes are now a major enhancement to UMBEL v. 0.80. The assignment results and the SuperType specification are discussed in Part II, with full analysis results in Annex G.
In addition, all of these SuperTypes are clustered into nine “dimensions”, which are useful for aggregation and organizational purposes, but which have no direct bearing on logic assertions or disjointness testing. These nine dimensions, with their associated SuperTypes, are shown in the dimensions table earlier in this post (under Major Change #2). Note that the last two dimensions, Descriptive and Classificatory, with their four SuperTypes, are by definition non-disjoint.
The construct of the SuperType may be applied to any domain ontology constructed with the UMBEL Vocabulary. The UMBEL Reference Concept Ontology includes all disjoint assertions for all of its RefConcepts.
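The practical payoff of disjoint SuperTypes is that a misassignment can be caught mechanically. The sketch below flags any concept placed in more than one disjoint SuperType. It makes a simplifying assumption that every SuperType outside the Descriptive and Classificatory dimensions is pairwise disjoint with all the others; the post only says the SuperTypes are “mostly” disjoint, and the actual assertions are those in the specification and Annex G. The concept assignments are invented for the example.

```python
# The four SuperTypes that this post describes as non-disjoint by definition.
NON_DISJOINT = {
    "Attributes",
    "Abstract-level",
    "Topics/Categories",
    "Markets & Industries",
}


def disjointness_conflicts(assignments):
    """Return concepts assigned to more than one disjoint SuperType.

    Simplifying assumption: all SuperTypes not listed in NON_DISJOINT are
    treated as pairwise disjoint.
    """
    conflicts = {}
    for concept, supertypes in assignments.items():
        disjoint = sorted(set(supertypes) - NON_DISJOINT)
        if len(disjoint) > 1:
            conflicts[concept] = disjoint
    return conflicts


# Hypothetical assignments of concepts to SuperTypes.
assignments = {
    "Oak": ["Plants"],
    "Eagle": ["Animals", "Topics/Categories"],
    "Mushroom": ["Plants", "Protists & Fungus"],
}

print(disjointness_conflicts(assignments))
```

In a real deployment an OWL 2 reasoner would do this work from the disjointness axioms themselves; the sketch only shows why such axioms make inconsistencies cheap to detect.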
Major Change #4: OWL 2 Compliance
One of the most challenging improvements in the new UMBEL version 0.80 was to make its vocabulary and ontology compliant with the new OWL 2 Web Ontology Language. We wanted to convert to OWL 2 in order to:
- Use OWL reasoners
- Load the full UMBEL into the Protégé 4 ontology editor
- Use the OWL API, consistent with many other ontology tools we are pursuing, and
- Take advantage of a neat trick in OWL 2 called “punning”.
The latter reason is the most important given the reference role of UMBEL and ontologies based on the UMBEL Vocabulary. It is not unusual to want to treat things either as a class or an instance in an ontology. Among other aspects, this is known as metamodeling and it can be accomplished in a number of ways. “Punning” is one metamodeling technique that importantly allows us to use concepts in ontologies as either classes or instances, depending on context.
To better understand why we should metamodel, let’s look at a couple of examples, both of which combine organizing categories of things and then describing or characterizing those things. This dual need is common to most domains [4].
As one example, let’s take a categorization of apes as a kind of mammal, which is then a kind of animal. In these cases, ape is a class, which relates to other classes, and apes may also have members, be they particular kinds of apes or individual apes. Yet, at the same time, we want to assert some characteristics of apes, such as being hairy, having two legs and two arms, having no tail, being capable of walking bipedally, having grasping hands, and, for some, being endangered species. These characteristics apply to the notion of apes as an instance.
As another example we may have the category of trucks, which may further be split into truck types, brands of trucks, type of engine, and so forth. Yet, again, we may want to characterize that a truck is designed primarily for the transport of cargo (as opposed to automobiles for people transport), or that trucks may have different drivers license requirements or different license fees than autos. These descriptive properties refer to trucks as an instance.
These mixed cases combine both the organization of concepts in relation to one another and with respect to their set members, with the description and characterization of these concepts as things unto themselves. This is a natural and common way to express most any domain of interest. It is also a general requirement for a reference ontology, as we use in the sense of UMBEL.
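A small sketch may help show what “class or instance, depending on context” means in practice. It takes a handful of statements, including the eagle assertions that appear in this post's notes (`a:Harry rdf:type a:Eagle` and `a:Eagle rdf:type a:EndangeredSpecies`), and reports which names are used in both roles. The `rdfs:subClassOf` statement, the tuple representation, and the role-detection rules are simplifying assumptions for illustration; an OWL 2 tool works from declared entity types, not from this kind of heuristic.

```python
TRIPLES = [
    ("a:Eagle", "rdfs:subClassOf", "a:Bird"),        # assumed, for illustration
    ("a:Harry", "rdf:type", "a:Eagle"),              # Eagle used as a class
    ("a:Eagle", "rdf:type", "a:EndangeredSpecies"),  # Eagle used as an instance
]


def class_terms(triples):
    """Names used in a class position in at least one statement."""
    classes = set()
    for subject, predicate, obj in triples:
        if predicate == "rdf:type":
            classes.add(obj)
        elif predicate == "rdfs:subClassOf":
            classes.update((subject, obj))
    return classes


def individual_terms(triples):
    """Names used as the subject of a type assertion."""
    return {s for s, p, _ in triples if p == "rdf:type"}


def punned_terms(triples):
    """Names that play both the class and the individual role."""
    return class_terms(triples) & individual_terms(triples)


print(punned_terms(TRIPLES))
```

Here only `a:Eagle` plays both roles: it has a member (`a:Harry`) and is itself a member of `a:EndangeredSpecies`, which is exactly the mixed case described above.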
When we combine this “punning” aspect of OWL 2 with our standard way of relating concepts in a hierarchical manner, this general view of the predicates within UMBEL emerges:
[Figure: the predicates within UMBEL, arranged in four quadrants labeled A through D.]
On the left-hand side (quadrants A and C) is the “class” view of the structure; the right-hand side is the “individual” (or instance) view of the structure (quadrants B and D). These two views represent alternative perspectives for looking at the UMBEL reference concepts based on metamodeling.
The top side of the diagram (quadrants A and B) is an internal view of UMBEL reference concepts (RefConcept) and their predicates (properties). This internal view applies to the UMBEL Reference Concept Ontology or to domain ontologies based on the UMBEL Vocabulary. These relationships show how RefConcepts are clustered into SuperTypes or how hierarchical relationships are established between Reference Concepts (via the skos:narrowerTransitive and skos:broaderTransitive relations). The concept relationships and their structure form a “class” view (quadrant A); treating these concepts as instances in their own right and relating them to SKOS is provided by the right-hand “individual” (instance) view (quadrant B).
The bottom of the diagram (quadrants C and D) shows either classes or individuals in external ontologies. The key mapping predicates cross this boundary (the broad dotted line) between UMBEL-based ontologies and external ontologies. See Part I of the full specification for a more detailed discussion of this figure and its relation to metamodeling.
Major Change #5: Documentation and Packaging
These changes also warranted better documentation and a better project Web site. From a documentation standpoint, the organization was simplified between the actual specifications and related annexes. Also, because of a more collaborative basis resulting from the new partnership with Ontotext, we also established an internal wiki following TechWiki designs. Initial authoring occurs there, with final results re-purposed and published on the project Web site.
The UMBEL Web site also underwent a major upgrade. It is now based on Drupal, and therefore will be able to embrace our conStruct advances in visualization and access over time. We also posted the full Reference Concept Ontology as an OWLDoc portal.
We feel these changes have now resulted in a clean and easy-to-maintain framework for the next phase in UMBEL’s growth and maturation.
Next Steps and Version
As noted in the intro, this version is but an interim step to the pending next release of UMBEL v. 1.00. This next version will provide mappings to leading ontologies and knowledge bases, as well as the upgrade of existing Web services and other language support features. Intended production or commercial uses would best await this next version.
However, the current version 0.80 is fully consistent and OWL 2-compliant. It loads and can be reasoned over with OWL 2 reasoners (see those available with Protégé 4.1, for example). We encourage you to download, test and comment upon this version. Specifics are:
- UMBEL Web site
- UMBEL Specifications (and Annexes A – G)
- Discussion Group
- Downloads and SVN
- UMBEL Vocabulary
- UMBEL RefConcepts.
As co-editors, Frédérick Giasson and I are extremely enthused about the changes and cleanliness of version 0.80. It is already helping our client work. We think these improvements are a good harbinger for UMBEL version 1.00 to come by the end of the year. We hope you agree.
(2) a:Harry rdf:type a:Eagle
(4) a:Eagle rdf:type a:EndangeredSpecies.