Better Mappings, More Properties
When we released KBpedia v 1.60 as open source a couple of weeks back, I noted that I would follow up the announcement with more details on the changes made in preparation for the release. This post provides that update.
KBpedia is a computable knowledge structure that combines seven major public knowledge bases — Wikipedia, Wikidata, schema.org, DBpedia, GeoNames, OpenCyc, and UMBEL. KBpedia supplements these core KBs with mappings to more than a score of additional leading vocabularies. The entire KBpedia structure is computable, meaning it can be reasoned over and logically sliced-and-diced to produce training sets and reference standards for machine learning and data interoperability. KBpedia provides a coherent overlay for retrieving and organizing Wikipedia or Wikidata content. KBpedia greatly reduces the time and effort traditionally required for knowledge-based artificial intelligence (KBAI) tasks.
KBpedia is a comprehensive knowledge structure for promoting data interoperability and KBAI. KBpedia’s upper structure, the KBpedia Knowledge Ontology (KKO), is based on the universal categories and knowledge representation theories of the great 19th century American logician, philosopher, polymath and scientist, Charles Sanders Peirce. This design provides a logical and coherent underpinning to the entire KBpedia structure. The design is also modular and fairly straightforward to adapt to enterprise or domain purposes. KBpedia was first released in October 2016. My initial announcement provides further details on KBpedia and how to download it.
Besides prepping the KBpedia knowledge artifact for open-source release, we also made these improvements to the base structure compared with the prior v 1.51, the last proprietary version:
- The major effort was to increase the mapping to Wikidata, with most mappings represented as owl:equivalentClass. Coverage of KBpedia to Wikidata is now 50%, with 27,423 of KBpedia’s reference concepts now mapped to Wikidata. Version 1.60 has 4.5x more coverage than the previous v 1.51
- We also continued to increase coverage to Wikipedia, with coverage now at 77%
- We now have essentially complete coverage to DBpedia ontology, schema.org and GeoNames
- We doubled the number of mapped properties to nearly 5,000 and added schema.org property mappings
- We organized the properties into attributes, indexes/indices, and external relations
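To make the owl:equivalentClass mappings mentioned above a little more concrete, here is a minimal Python sketch of how one might pick the Wikidata equivalences out of a linkage file. It assumes an N-Triples-style serialization with one triple per line, which may not match the actual download format. The KBpedia and Wikidata identifiers in the sample (ExampleConceptA, Q0000001, and so on) are made up for illustration, and the function name `wikidata_equivalents` is hypothetical.

```python
import re

OWL_EQUIVALENT_CLASS = "http://www.w3.org/2002/07/owl#equivalentClass"
WIKIDATA_PREFIX = "http://www.wikidata.org/entity/"

# Hypothetical sample data: these concept names and Q-identifiers are
# placeholders, not real KBpedia or Wikidata records.
SAMPLE = """\
<http://kbpedia.org/kko/rc/ExampleConceptA> <http://www.w3.org/2002/07/owl#equivalentClass> <http://www.wikidata.org/entity/Q0000001> .
<http://kbpedia.org/kko/rc/ExampleConceptB> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://www.wikidata.org/entity/Q0000002> .
<http://kbpedia.org/kko/rc/ExampleConceptC> <http://www.w3.org/2002/07/owl#equivalentClass> <http://schema.org/Thing> .
"""

# Matches only the simplest triple shape: three IRIs followed by a period.
TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$")


def wikidata_equivalents(text):
    """Return (subject, object) pairs that are equivalent-class links to Wikidata."""
    pairs = []
    for line in text.splitlines():
        match = TRIPLE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything with literals or blank nodes
        subject, predicate, obj = match.groups()
        if predicate == OWL_EQUIVALENT_CLASS and obj.startswith(WIKIDATA_PREFIX):
            pairs.append((subject, obj))
    return pairs


for subject, obj in wikidata_equivalents(SAMPLE):
    print(subject, "==", obj)
```

For real files, a full RDF parser would be a safer choice than a regular expression; the regex here only keeps the sketch free of external dependencies.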
Please note that we measure coverage as the larger of two percentages: the share of the external source’s concepts that are mapped, or the share of KBpedia’s reference concepts that are mapped to that source. The % Change figures represent the changes from v 1.51 to the new open source v 1.60.
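The coverage measure just described can be sketched in a few lines of Python. The reference concept count (54,867) and the Wikidata mapping count (27,423) are the v 1.60 figures; the size of the external source is not given in this post, so `ASSUMED_EXTERNAL_TOTAL` is a placeholder chosen only to show which side of the comparison wins. The function name `coverage` is likewise just for illustration.

```python
def coverage(mapped, kbpedia_total, external_total):
    """Coverage is the larger of the two mapped shares."""
    kbpedia_share = mapped / kbpedia_total    # share of KBpedia RCs mapped
    external_share = mapped / external_total  # share of external concepts mapped
    return max(kbpedia_share, external_share)


KBPEDIA_RCS = 54_867       # reference concepts in v 1.60
WIKIDATA_MAPPED = 27_423   # RCs mapped to Wikidata in v 1.60

# Assumption: a placeholder, not a real Wikidata statistic. Any external
# total larger than KBPEDIA_RCS gives the same result for this example.
ASSUMED_EXTERNAL_TOTAL = 1_000_000

wikidata_coverage = coverage(WIKIDATA_MAPPED, KBPEDIA_RCS, ASSUMED_EXTERNAL_TOTAL)
print(f"Wikidata coverage: {wikidata_coverage:.0%}")
```

For a small vocabulary that is almost fully mapped, the external share is the larger term, which is how a source with only a few hundred mappings can still show coverage near 99%.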
Besides the property organization, we made few changes in this latest v 1.60 release to the overall structure or scope of KBpedia. The emphasis was on mapping to existing sources and cleanup for public release. Here are the major statistics for v 1.60:
| Structure | Value | % Change | Coverage |
|---|---|---|---|
| No. of RCs | 54,867 | 2.7% | |
| KKO | 173 | -0.6% | |
| Standard RCs | 54,694 | 2.7% | |
| No. of mapped vocabularies | 23 | -14.8% | |
| Core KBs | 7 | 16.7% | |
| Extended vocabs | 16 | -23.8% | |
| No. of typologies | 68 | 7.9% | |
| Core entity types | 33 | 0.0% | |
| Other core types | 5 | 0.0% | |
| Extended types | 30 | 20.0% | |
| No. of properties | 4,847 | 92.4% | |
| RC Mappings | 139,311 | 21.1% | |
| Wikipedia | 42,108 | 4.3% | 77% |
| Wikidata | 27,423 | 446.2% | 50% |
| schema.org | 845 | 15.1% | 99% |
| DBpedia ontology | 764 | 0.0% | 99% |
| GeoNames | 918 | 0.0% | 99% |
| OpenCyc | 33,526 | 0.0% | 61% |
| UMBEL | 33,478 | 0.0% | 99% |
| Extended vocabs | 249 | -4.2% | |
| Property Mappings | 4,847 | 92.4% | |
| Wikidata | 3,970 | 57.6% | |
| schema.org | 877 | N/A | |
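As a quick sanity check on the statistics above, the following Python sketch confirms that the per-source rows add up to the RC Mappings and Property Mappings totals, and back-calculates approximate v 1.51 values from the % Change column. The helper `prior_value` is a name introduced here for illustration, and because the published percentages are rounded, the back-calculated figures are estimates rather than official v 1.51 counts.

```python
# Per-source reference concept (RC) mappings for v 1.60.
rc_mappings = {
    "Wikipedia": 42_108,
    "Wikidata": 27_423,
    "schema.org": 845,
    "DBpedia ontology": 764,
    "GeoNames": 918,
    "OpenCyc": 33_526,
    "UMBEL": 33_478,
    "Extended vocabs": 249,
}

# Per-source property mappings for v 1.60.
property_mappings = {"Wikidata": 3_970, "schema.org": 877}


def prior_value(current, pct_change):
    """Estimate the earlier value implied by a percentage change."""
    return current / (1 + pct_change / 100)


print("RC mappings total:", sum(rc_mappings.values()))
print("Property mappings total:", sum(property_mappings.values()))
print("Estimated v 1.51 Wikidata RC mappings:", round(prior_value(27_423, 446.2)))
print("Estimated v 1.51 property count:", round(prior_value(4_847, 92.4)))
```

The same arithmetic explains the headline claims: a 446.2% increase is roughly 4.5x more Wikidata coverage, and a 92.4% increase is close to a doubling of mapped properties.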
Through its mapped sources, KBpedia links to more than 30 million entities, the largest percentage coming from Wikidata. The mappings to these external sources are provided in the linkages to the external resources file in the KBpedia downloads. (A larger inferred version is also available.) The external sources keep their own record files; KBpedia distributions provide the links. You can nonetheless access these entities through the KBpedia explorer on the project’s Web site (see these entity examples for cameras, cakes, and canyons; clicking on any of the individual entity links will bring up the full instance record).
Please know that KBpedia remains under active development, with new updates anticipated in the near future. We are incorporating feedback gained from the initial open source release, and are also committed to increasing the mapping coverage for the artifact and other baseline improvements. Our plan is to complete this baseline before new external sources are added to the system.
KBpedia is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. KBpedia’s development to date has been sponsored by Cognonto Corporation.