Despite page ranking and other techniques, the scale of the Internet is straining the ability of commercial search engines to deliver truly relevant content. This observation is not new, but its relevance is growing. Similarly, the integration and interoperability challenges facing enterprises have never been greater. One approach to address these needs, among others, is to adopt semantic Web standards and technologies.
The image is compelling: targeted and unambiguous information from all relevant sources, served in usable bite-sized chunks. It sounds great; why isn’t it happening?
There are clues (actually, reasons) why semantic Web technology is not being embraced on a broad scale. I have spoken elsewhere as to why enterprises or specific organizations will be the initial adopters and promoters of these technologies. I still believe that to be the case. The complexity and lack of a network effect ensure that semantic Web stuff will not initially arise from the public Internet.
Parallels with Knowledge Management
Paul Warren, in “Knowledge Management and the Semantic Web: From Scenario to Technology,” IEEE Intelligent Systems, vol. 21, no. 1, 2006, pp. 53-59, has provided a structured framework for why these assertions make sense. This February online article is essential reading for anyone interested in semantic Web issues (and has a listing of fairly classic references).
If you can get past the first silly paragraphs regarding Sally the political scientist and her research example (perhaps in a separate post I will provide better real-world examples from open source intelligence, or OSINT), Warren actually begins to dissect the real issues and challenges in effecting the semantic Web. It is this latter two-thirds or so of Warren’s piece that is essential reading.
He does not organize his piece in the manner listed below, but real clues emerge in the repeated pointing to the need for “semi-automatic” methods to make the semantic Web a reality. Fully a dozen such references are provided. Relatedly, in second place, are multiple references to the need or value of “reasoning algorithms.” In any case, here are some of the areas noted by Warren needing “semi-automatic” methods:
- Assign authoritativeness
- Learn ontologies
- Infer better search requests
- Mediate ontologies (semantic resolution)
- Support visualization
- Assign collaborations
- Infer relationships
- Extract entities
- Create ontologies
- Maintain and evolve ontologies
- Create taxonomies
- Infer trust
- Analyze links
- etc.
These challenges are not listed in order of relevance, but as encountered in reading the Warren piece. Tagging, extracting, classifying and organizing are all pretty intense tasks that certainly cannot be done solely by hand while still scaling.
Keep It Simple, Stupid
The lack of “simple” approaches is posited as another reason for slow adoption of the semantic Web. In the article “Spread the word, and join it up,” in the April 6 Guardian, SA Matheson reports Tim O’Reilly as saying:
“I completely believe in the long-term vision of the semantic web – that we’re moving towards a web of data, and sophisticated applications that manipulate and navigate that data web. However, I don’t believe that the W3C semantic web activity is what’s going to take us there…. It always seemed a bit ironic to me that Berners-Lee, who overthrew many of the most cherished tenets of both hypertext theory and SGML with his ‘less is more and worse is better’ implementation of ideas from both in the world wide web, has been deeply enmeshed in a theoretical exercise rather than just celebrating the bottom-up activity that will ultimately result in the semantic web…. It’s still too early to formalise the mechanisms for the semantic web. We’re going to learn by doing, and make small, incremental steps, rather than a great leap forward.”
Simplicity is certainly needed to encourage voluntary adoption of semantic Web practices in the period before the broad benefits and network effects of the semantic Web have been realized. However, simplicity and broad use are but two of the factors limiting adoption; others include incentives, self-interest and rewards.
As Warren points out in his piece:
Although knowledge workers no doubt believe in the value of annotating their documents, the pressure to create metadata isn’t present. In fact, the pressure of time will work in a counter direction. Annotation’s benefits accrue to other workers; the knowledge creator only benefits if a community of knowledge workers abides by the same rules. In addition, the volume of information in this scenario is much greater than in the services scenario. So, it’s unlikely that manual annotation of information will occur to the extent required to make this scenario work. We need techniques for reducing the load on the knowledge creator.
Somehow we keep coming back to the tools and automated ways to ease the effort and workflow necessary to put in place all of this semantic Web infrastructure. These aids are no doubt important, perhaps critical, but in my mind a focus on them still shortchanges the most determinant dynamic of semantic Web technology adoption: the imperatives of the loosely-federated, peer-to-peer broader Web v. enterprise adoption.
Oligarchical (Enterprise) Control Precedes the Network Effect
There are some analogies between service-oriented architectures and their associated standards, and the standards contemplated for the semantic Web. Both are rigorous, prescribed, and meant to be intellectually and functionally complete. (In fact, most of the WS** standards are specific SOA ones for the semantic Web.) The past week has seen some very interesting posts on the tensions framed as “SOA Versus Web 2.0?”, triggered by John Hagel’s post:
. . . a cultural chasm separates these two technology communities, despite the fact that they both rely heavily on the same foundational standard – XML. The evangelists for SOA tend to dismiss Web 2.0 technologies as light-weight “toys” not suitable for the “real” work of enterprises. The champions of Web 2.0 technologies, on the other hand, make fun of the “bloated” standards and architectural drawings generated by enterprise architects, skeptically asking whether SOAs will ever do real work. This cultural gap is highly dysfunctional and IMHO precludes extraordinary opportunities to harness the potential of these two complementary technology sets.
This theme was picked up by Dion Hinchcliffe, among others. Dion consistently posts on this topic in his ZDNet Enterprise Web 2.0 and Web 2.0 blogs, and is always a thoughtful read. In his response to Hagel’s post, Hinchcliffe notes “… these two cultures are generally failing to cross-pollinate like they should, despite potentially ‘extraordinary opportunities.’”
Supposedly, kitchen and garage coders playing around with cool mashups while surfing and blogging and posting pictures to Flickr are a different “culture” from buttoned-down IT geeks (even if they wear T-shirts or knit shirts). But, in my experience, these differences have more to do with the claim on time than with any fact that we are talking about different tribes of people. From a development standpoint, we’re talking about the same people, with the real distinction being whether they are on payroll time or personal time.
I like the graphic that Hinchcliffe offers where he is talking about the SaaS model in the enterprise and the fact it may be the emerging form. You can take this graphic and say the left-hand side of the diagram is corporate time, the right-hand side personal time.
I make this distinction because it is perhaps more useful to look at where systems may go in terms of imperatives and opportunities v. some form of “culture” clash. In the broad Web, there is no control other than broadly-accepted standards; there is no hegemony; there is only what draws attention and can be implemented in a decentralized way. This impels simpler standards, and simpler “loosely-coupled” integrations. We thus see mashups and simpler Web 2.0 sites like social bookmarking. The drivers are not “complete” solutions to knowledge creation and sharing, but what is fun, cool and gets buzz.
The corporate, or enterprise, side, on the other hand, has a different set of imperatives and, as importantly, a different set of control mechanisms to set higher and more constraining standards to meet those imperatives. SOA and true semantic Web standards like RDF-S or OWL can be imposed, because the sponsor can either require it or pay for it. Of course, this oligarchic control still does not ensure adherence, just as IT departments were not able to prevent PC adoption 20 years ago, so it is important that productivity tools, workflows and employee incentives also be aligned with the desired outcomes.
So, what we are likely to see, indeed are seeing now, is that more innovation and experimentation in “looser” ways will take place in Web 2.0 by lots of folks, many of them in their personal time away from the office. Enterprises, on the other hand, will take the near-term lead on more rigorous and semantically-demanding integration and interoperability using semantic Web standards.
Working Both Ends to the Middle
I guess, then, this puts me squarely in the optimists’ camp, where I normally reside. (I also come squarely from an enterprise perspective since that is where my company resides.) I see innovation at an unprecedented level with Web 2.0, mashups and participatory media, matched with effort and focus by leading enterprises to climb the data federation pyramid while dealing with very real and intellectually challenging semantic mediation. Both ends of this spectrum are right, both will instruct, and therefore both should be monitored closely.
Warren gets it right when he points to prior knowledge management challenges as also informing the adoption challenges for the semantic Web in enterprises:
Currently, the main obstacle for introducing ontology-based knowledge management applications into commercial environments is the effort needed for ontology modeling and metadata creation. Developing semiautomatic tools for learning ontologies and extracting metadata is a key research area…. Having to move out of a user’s typical working environment to ‘do knowledge management’ will act as a disincentive, whether the user is creating or retrieving knowledge…. I believe there will be deep semantic interoperability within organizational intranets. This is already the focus of practical implementations, such as the SEKT (Semantically Enabled Knowledge Technologies) project, and across interworking organizations, such as supply chain consortia. In the global Web, semantic interoperability will be more limited.
My suspicion is that Web 2.0 is the sandbox where the tools, interfaces and approaches will emerge that help overcome these enterprise obstacles. But we will still look strongly to enterprises for much of the money and the W3C for the standards necessary to make it all happen within semantic Web imperatives.