So, what IS Linked Data?

Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

You may not have noticed, but the web is evolving from a global information space of interlinked documents into a global information space of interlinked data. But what does that even mean? Consider a database that you have in your company or on your website. It consists of tables and attributes with data in them. You can group all the related data in a table and also link it to data residing in another table through a foreign key. You are also able to query your database in order to get the exact information you want. This is an information space.
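To make that concrete, here is a minimal sketch of such an information space using Python's built-in sqlite3 module. The table names, column names, and rows are made up for illustration; they are not from any real system.

```python
import sqlite3

# An in-memory database with two hypothetical tables linked by a foreign key.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE city (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE customer (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    city_id INTEGER NOT NULL REFERENCES city(id)  -- the foreign key
);
INSERT INTO city VALUES (1, 'Austin'), (2, 'Bogota');
INSERT INTO customer VALUES (10, 'Alice', 1), (11, 'Bob', 2), (12, 'Carol', 1);
""")

# Follow the foreign key to ask for exactly the information we want.
rows = conn.execute("""
    SELECT customer.name, city.name
    FROM customer JOIN city ON customer.city_id = city.id
    WHERE city.name = 'Austin'
    ORDER BY customer.name
""").fetchall()
print(rows)
```

The join only works because both tables live in the same database, which is the limitation the rest of this post is about.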

Dynamic webpages are backed by databases. That means that the data in your database feeds the content of your webpage (CMS, blog, etc.). However, webpages do not exist by themselves on the web. When we create a webpage, we normally create hyperlinks to other webpages. This allows a user to read the content and, if desired, click on a link that sends them off to another page. The beauty of the web is that anybody can say anything they want about anything. In other words, I can create a link to another webpage that resides on a different server, on the other side of the world, and over which I have no control. Having the ability to do that makes the web a global information space of interlinked documents.

So, what’s the deal with this global information space of interlinked data? Let’s go back to your database and consider that you have a table with all your customer information which includes the city in which they are located. You would like to ask the following query:

“Return all the customers who are in cities where the population is greater than 1 million.”

Easy, right? There is a small problem: you do not have the population of each city in your database! One solution is to find an existing database that has this information and add it to your database. However, this may not be an appealing thing to do because you would need to alter your database.

Remember the “global information space of interlinked documents,” where a user could simply create a link to another webpage, even if the user did not have any control of that other webpage? Could we do something similar in this case? Assume there is a database of cities and their populations. Couldn’t we create links from our database directly to the other database? It would be like creating a foreign key from my table in my database to a completely different table in a different database, which I have no control of. With a conventional relational database, this is not possible: foreign keys generally work only within a single database system. So what could we do?

This is where Linked Data comes in! The same way you can create a link to a webpage that you have no control of, with Linked Data, you can create links to data residing in other databases on the web. Linked Data transcends the physical barriers of machines! How is this done? Simple! There are four Linked Data principles:

  1. Use URIs to name things: every single record in your database will have a URI, which will be its name. Think of it as a globally unique primary key.
  2. Use HTTP URIs so that people can look up those names: if you type that URI into your browser, you should get back information.
  3. When somebody looks up a URI, provide useful information using the standards (RDF): for a URI that identifies a record in your database, you could return all the values in that record as RDF.
  4. Include links to other URIs, so that people can discover things: this is where the magic happens. You can have internal links to URIs under your control, or you can simply add links to other data on the web.
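As a rough sketch of how the four principles might look for a customer table, the snippet below turns rows into RDF statements written in the N-Triples syntax. Everything under example.com (the base URI, the `name` and `locatedIn` properties) is a hypothetical placeholder, and the choice of DBpedia as the external dataset to link to is an assumption for illustration. Serving these URIs over HTTP (principle 2) would need a web server, which is not shown here.

```python
# Hypothetical namespace for our own data; a real one would be a domain you control.
BASE = "http://example.com/customer/"
VOCAB = "http://example.com/vocab#"
# Assumed external dataset whose URIs we link to but do not control.
DBPEDIA = "http://dbpedia.org/resource/"

# Made-up rows as they might come out of a customer table.
customers = [
    {"id": 10, "name": "Alice", "city": "Austin,_Texas"},
    {"id": 11, "name": "Bob", "city": "Bogot\u00e1"},
]

def customer_uri(row):
    # Principles 1 and 2: every record gets an HTTP URI as its name.
    return f"{BASE}{row['id']}"

def to_ntriples(row):
    subject = customer_uri(row)
    name = row["name"]
    city = row["city"]
    return [
        # Principle 3: describe the record using RDF (subject, predicate, object).
        f'<{subject}> <{VOCAB}name> "{name}" .',
        # Principle 4: link to a URI in somebody else's dataset.
        f"<{subject}> <{VOCAB}locatedIn> <{DBPEDIA}{city}> .",
    ]

for row in customers:
    for line in to_ntriples(row):
        print(line)
```

The `locatedIn` statement plays the role of the foreign key, except that it points at a URI on the web rather than at a row in a local table.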

Don’t confuse Linked Data with Linked Open Data. The latter applies the Linked Data principles to open data. Not all data has to be open and public. Therefore, you can apply these same principles to your internal enterprise data. Linked Data allows data residing in different databases to be linked, and therefore integrated. You can now query your integrated databases using SPARQL, the query language for RDF.
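Here is a small sketch of what that integration buys you for the earlier "population greater than 1 million" question. It uses plain Python tuples to stand in for RDF triples rather than a real RDF store or SPARQL engine. All URIs, property names, and population figures are made up for illustration. The SPARQL text is included only to show what the equivalent query might look like; it is not executed.

```python
EX = "http://example.com/vocab#"  # hypothetical vocabulary

# Triples from our own customer database.
our_triples = [
    ("http://example.com/customer/10", EX + "name", "Alice"),
    ("http://example.com/customer/10", EX + "locatedIn", "http://example.org/city/A"),
    ("http://example.com/customer/11", EX + "name", "Bob"),
    ("http://example.com/customer/11", EX + "locatedIn", "http://example.org/city/B"),
]

# Triples published by somebody else; the populations are invented numbers.
their_triples = [
    ("http://example.org/city/A", EX + "population", 2_500_000),
    ("http://example.org/city/B", EX + "population", 300_000),
]

# What the same question might look like in SPARQL (shown for comparison only).
SPARQL_EQUIVALENT = """
PREFIX ex: <http://example.com/vocab#>
SELECT ?name WHERE {
  ?customer ex:name ?name ;
            ex:locatedIn ?city .
  ?city ex:population ?pop .
  FILTER (?pop > 1000000)
}
"""

def customers_in_big_cities(triples, threshold=1_000_000):
    # Index the facts we need by subject URI.
    population = {s: o for s, p, o in triples if p == EX + "population"}
    name = {s: o for s, p, o in triples if p == EX + "name"}
    # Follow each locatedIn link and keep customers whose city is big enough.
    return sorted(
        name[s]
        for s, p, o in triples
        if p == EX + "locatedIn" and s in name and population.get(o, 0) > threshold
    )

print(customers_in_big_cities(our_triples + their_triples))
```

Because both datasets use the same URI for each city, merging them is just a matter of putting the triples together. On our own triples alone, the same question returns nothing, since the population facts live in the other dataset.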

Likewise, when data is published on the web following the Linked Data principles, the web starts to look like one giant global database. With SPARQL, we can start to query data published this way as if the web were a database. How cool is that?!

Very cool and useful!

Editor’s Note: The Semantic Web and Linked Data are in some ways elegantly simple. In other ways, they are quite complex. Many explanations of what they are have been given, to different audiences and with varying degrees of success. Often, I have heard people say that they needed to have these technologies explained several times before they had a clear “aha” moment. And those of us doing the explaining refine our explanations over time.

I’d like to welcome blogger Juan Sequeda, as he takes on the challenge of explaining Linked Data in his first post.  I have always appreciated Juan’s easy-to-understand communication style rooted in strong technical knowledge.  In the next year, we will hear more from him, and I look forward to that.  We will also publish more “Introduction To” pieces like this one.  Let us know what you think! –Eric Franzon