The holy grail of the Semantic Web is intelligent agents that can handle all kinds of tasks for us, similar to what Siri is starting to do. Imagine that my Semantic Web agent knows I’ll be traveling to Bonn, Germany, and makes a reservation at a restaurant that it thinks I would like and that a friend has recommended. In theory, this would be possible if all the data on the Web were published as Linked Data. Just imagine TripIt data linked to Facebook and to DBpedia, which in turn is linked to Yelp and OpenTable. My Semantic Web agent would be able to query all of this data together and pull it off. The technology to make this happen already exists. The only things that are missing are:

  1. data published as Linked Data on the Web, including links between data from different sources, and
  2. a way to query everything together.

I’m personally excited about the second issue: querying the Web as if it were a gigantic database.

I met Olaf Hartig at the Publishing Linked Data tutorial at the 2008 International Semantic Web Conference. He closed the tutorial by demoing a way to consume Linked Data on the Web through SPARQL. That was the moment it hit me: the Semantic Web makes the Web appear as one gigantic database. Olaf and I went on to develop SQUIN, a query interface for the Web of Linked Data. Basically, you write a SPARQL query and it gets executed over the Web itself, without specifying any SPARQL endpoint. I demoed SQUIN at the 2009 Semtech Conference, and it was well received even though it was preliminary research. Olaf has been researching this topic ever since.
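
To make the idea of querying without an endpoint a bit more concrete, here is a toy Python sketch: start from a seed URI, look up the data published there, follow the URIs that data mentions, and then match the query patterns against everything collected. This is only an illustration under heavy assumptions, not SQUIN's actual implementation. The `TOY_WEB` dictionary stands in for real HTTP lookups, the `ex:` URIs and the helper names (`dereference`, `collect`, `evaluate`) are all made up, and the triple patterns are a simplified stand-in for SPARQL.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical stand-in for the Web: "dereferencing" a URI returns the
# triples published at that URI. A real system would do HTTP lookups here.
TOY_WEB: Dict[str, List[Triple]] = {
    "ex:me": [
        ("ex:me", "ex:travelsTo", "ex:Bonn"),
        ("ex:me", "ex:knows", "ex:friend"),
    ],
    "ex:friend": [("ex:friend", "ex:recommends", "ex:restaurant1")],
    "ex:restaurant1": [("ex:restaurant1", "ex:locatedIn", "ex:Bonn")],
    "ex:Bonn": [("ex:Bonn", "ex:country", "ex:Germany")],
}


def dereference(uri: str) -> List[Triple]:
    """Return the triples published at `uri` (empty if nothing is there)."""
    return TOY_WEB.get(uri, [])


def collect(seed: str, max_docs: int = 50) -> Set[Triple]:
    """Follow links from `seed`, gathering triples from each URI visited."""
    seen: Set[str] = set()
    data: Set[Triple] = set()
    frontier = [seed]
    while frontier and len(seen) < max_docs:
        uri = frontier.pop()
        if uri in seen:
            continue
        seen.add(uri)
        for triple in dereference(uri):
            data.add(triple)
            # Every URI mentioned in a triple is a link we may follow.
            frontier.extend(term for term in triple if term not in seen)
    return data


def match(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):  # a variable such as "?city"
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def evaluate(patterns: List[Triple], data: Set[Triple]) -> List[Binding]:
    """Join the triple patterns over the collected data, one at a time."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        next_bindings: List[Binding] = []
        for binding in bindings:
            for triple in data:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# "Which place, recommended by someone I know, is in the city I travel to?"
query = [
    ("ex:me", "ex:travelsTo", "?city"),
    ("ex:me", "ex:knows", "?friend"),
    ("?friend", "ex:recommends", "?place"),
    ("?place", "ex:locatedIn", "?city"),
]
results = evaluate(query, collect("ex:me"))
print(results)
```

Note that this sketch crawls first and evaluates afterwards, which keeps it short. That is a simplification I chose for the example. The point is just that the only starting information is a URI and a query, with no endpoint in sight.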

I think it’s time that we start thinking crazy. Let’s ask ourselves: what would it take to actually query the whole Web itself? This is not a new idea; several researchers have thought about it since the mid-1990s. However, they didn’t have structured data on the Web at that time. We do now! Can we create a system that is able to answer structured questions (queries) over it? The Yahoo Query Language is a start, but it only allows querying their cached data. I can’t query *everything*. What would it take to ask and to answer queries over the whole Web of Data itself?

Olaf and I have been thinking about this for a while, and we wrote a vision paper about it. We highlight four different areas that researchers should think about and pose several open questions, but obviously offer no solutions. We want to share our vision with everybody and inspire research! We believe that we are 5 to 10 years away from making this possible. If you are curious, check out our vision paper: Towards a Query Language for the Web of Data.

Enjoy!