Though it has been out since June, I only today came across an interview with Tim Berners-Lee on the Semantic Web, conducted by Andrew Updegrove for the Consortium Standards Bulletin. I highly recommend this piece for anyone interested in an insider’s view of the creation and use of the Semantic Web. Here are some highlights; all are direct quotes from Berners-Lee.
First, on the vision of the Semantic Web:
The goal of the Semantic Web initiative is to create a universal medium for the exchange of data where data can be shared and processed by automated tools as well as by people. The Semantic Web is designed to smoothly interconnect personal information management, enterprise application integration, and the global sharing of commercial, scientific and cultural data.
Many large-scale benefits are, not surprisingly, evident for enterprise level applications. The benefits of being able to reuse and repurpose information inside the enterprise include both for savings and new discoveries. And of course, more usable data brings about a new wave of software development for data analysis, visualization, smart catalogues… not to mention new applications development. The point of the Semantic Web is in the potential for new uses of data on the Web, much of which we haven’t discovered yet.
As for the status of the initiative, Berners-Lee directly addresses some critics by emphasizing the importance of automated tools rather than author tagging:
It’s not about people encoding web pages; it’s about applications generating machine-readable data on an entirely different scale. Were the Semantic Web to be enacted on a page-by-page basis in this era of fully functional databases and content management systems on the Web, we would never get there. What is happening is that more applications — authoring tools, database technologies, and enterprise-level applications — are using the initial W3C Semantic Web standards for description (RDF) and ontologies (OWL).
Berners-Lee goes on to say:
One of the criticisms I hear most often is, “The Semantic Web doesn’t do anything for me I can’t do with XML”. This is a typical response of someone who is very used to programming things in XML, and never has tried to integrate things across large expanses of an organization, at short notice, with no further programming. One IT professional who made that comment around four years ago, said a year ago words to the effect, “After spending three years organizing my XML until I had a heap of home-made programs to keep track of the relationships between different schemas, I suddenly realized why RDF had been designed. Now I used RDF and its all so simple — but if I hadn’t have had three years of XML hell, I wouldn’t ever have understood.”
Many of the criticisms of the Semantic Web seems (to me at least!) the result of not having understood the philosophy of how it works. A critical part, perhaps not obvious from the specs, is the way different communities of practice develop independently, bottom up, and then can connect link by link, like patches sewn together at the edges. So some criticize the Semantic Web for being a (clearly impossible) attempt to make a complete top-down ontology of everything.
Others criticize the Semantic Web because they think that everything in the whole Semantic Web will have to be consistent, which is of course impossible. In fact, the only things I need to be consistent are the bits of the Semantic Web I am using to solve my current problem.
The web-like nature of the Semantic Web sometimes comes under criticism. People want to treat it as a big XML document tree so that they can use XML tools on it, when in fact it is a web, not a tree. A semantic tree just doesn’t scale, because each person would have their own view of where the root would have to be, and which way the sap should flow in each branch. Only webs can be merged together in arbitrary ways. I think I agree with criticisms of the RDF/XML syntax that it isn’t very easy to read. This raises the entry threshold. That’s why we wrote N3 and the N3 tutorial, to get newcomers on board with the simplicity of the concepts, without the complexity of that serialization.
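To make the "web, not a tree" point concrete for myself, here is a minimal Python sketch. It is my own illustration, not anything from the interview: it assumes a graph can be modeled as a plain set of (subject, predicate, object) triples, and every name in it (the `ex:` identifiers, `merge`, `objects`) is made up for the example. Under that assumption, merging two independently built graphs is just set union, with no root to agree on and no particular order required.

```python
from typing import Set, Tuple

# A triple is (subject, predicate, object). All identifiers below are
# hypothetical and exist only for illustration.
Triple = Tuple[str, str, str]

# Two "communities of practice" describe overlapping things independently.
hr_graph: Set[Triple] = {
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:alice", "ex:name", "Alice"),
}
sales_graph: Set[Triple] = {
    ("ex:acme", "ex:locatedIn", "ex:boston"),
    ("ex:alice", "ex:worksFor", "ex:acme"),  # a duplicate statement is harmless
}


def merge(*graphs: Set[Triple]) -> Set[Triple]:
    """Merge any number of graphs by taking the union of their triples."""
    merged: Set[Triple] = set()
    for graph in graphs:
        merged |= graph
    return merged


def objects(graph: Set[Triple], subject: str, predicate: str) -> Set[str]:
    """Return every object linked from `subject` via `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


merged = merge(hr_graph, sales_graph)

# Follow a link that crosses the seam between the two source graphs:
# alice -> worksFor -> acme -> locatedIn -> boston
employers = objects(merged, "ex:alice", "ex:worksFor")
locations = {
    place
    for employer in employers
    for place in objects(merged, employer, "ex:locatedIn")
}
```

Neither source graph alone can say where Alice's employer is located; the merged one can, because the shared identifier `ex:acme` acts as the stitch between the two patches. A real RDF toolkit handles more than this sketch does (blank nodes and full URIs, for instance), so treat it only as a picture of the idea.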
Other insights from the interview are that early adoption is likely to happen inside enterprises, on their intranets; that there will definitely be first-mover advantages for software applications that embrace RDF and OWL; and that a more widely embraced rules-based language (think of a successor to Prolog) is likely to emerge.
Highly recommended reading!