In earlier posts I put forward a vision for the semantic Web in the enterprise: an extensible database supporting semi-structured data at its core, with XML mediating multiple ingest feeds, interaction with analytic tools, and delivery of results to visualization and reporting tools.
This is all well and good as far as it goes. Inevitably, however, each additional tool or semi-structured dataset brings its own “view” of the world into the system. Formalized, standardized protocols and languages are needed both to 1) capture these disparate “views” and 2) provide facilities to map them onto one another, resolving the data and schema heterogeneities that arise in federation. These are the roles of RDF and OWL.
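To make the two roles concrete, here is a minimal sketch in plain Python. It assumes two hypothetical feeds (a CRM and an HR system, with invented example.org namespaces and sample records) that name the same concept differently. Each “view” is captured as subject-predicate-object triples in the spirit of RDF, and the mapping between them is stated as data using the OWL term owl:equivalentProperty. The one-directional rewrite below is a simplification for illustration, not a full OWL reasoner.

```python
# Only the owl: IRI below is a real W3C vocabulary term; the CRM and HR
# namespaces and all sample records are hypothetical.
OWL_EQUIV_PROP = "http://www.w3.org/2002/07/owl#equivalentProperty"
CRM = "http://example.org/crm#"
HR = "http://example.org/hr#"

# Two feeds, each with its own "view": same concept, different property names.
crm_triples = {
    (CRM + "person/42", CRM + "fullName", "Ada Lovelace"),
}
hr_triples = {
    (HR + "emp/7", HR + "name", "Alan Turing"),
}

# The mapping is itself just more triples, so it can be stored and shared
# alongside the data rather than buried in application code.
mapping_triples = {
    (HR + "name", OWL_EQUIV_PROP, CRM + "fullName"),
}


def canonical_properties(mappings):
    """Return {property: canonical property} from equivalence statements."""
    return {s: o for (s, p, o) in mappings if p == OWL_EQUIV_PROP}


def federate(datasets, mappings):
    """Merge datasets, rewriting mapped properties to their canonical form."""
    canon = canonical_properties(mappings)
    merged = set()
    for triples in datasets:
        for s, p, o in triples:
            merged.add((s, canon.get(p, p), o))
    return merged


def values_of(triples, prop):
    """All object values for one property, sorted for stable output."""
    return sorted(o for (_, p, o) in triples if p == prop)


merged = federate([crm_triples, hr_triples], mapping_triples)
names = values_of(merged, CRM + "fullName")
print(names)  # ['Ada Lovelace', 'Alan Turing']
```

After federation, a single query against the canonical property returns records from both feeds, even though neither feed had to change its own vocabulary.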
Fortunately, there is a very active community with tools and insights for working in RDF and OWL. Stanford and UMBC are perhaps the two leading centers of academic excellence.
If you are not generally familiar with this material, I recommend you begin with the recent “Order from Chaos” by Natalya Noy of the Protégé group at Stanford Medical. The piece covers issues such as trust that are likely less relevant to applying the semantic Web on enterprise intranets than they are to the cowboy nature of the broader Internet. However, much of the rest of the article is of general use to the architect considering enterprise applications.
To keep things simple and to promote interoperability, a critical aspect of any enterprise semantic Web implementation will be providing the “data API” standards (including extensible XML, RDF and OWL) that govern the rules of how to play in the sandbox. Time spent defining these rules of engagement will pay off in spades compared with any other approach to supporting multiple ingest feeds, multiple analytic tools, and multiple audiences, reports and collaboration.
Another advantage of this approach is the existence of many open source tools for managing such schemas (e.g., Protégé) and for visualization (literally dozens), along with thousands of existing ontologies and other intellectual property.