Berners-Lee Leads the Way on Linked Data
Jennifer Zaino
SemanticWeb.com Contributor
NEW YORK — At the LinkedData Planet Conference & Expo here this week, Sir Tim Berners-Lee talked about linking data using open standards, both in his keynote and in a press conference before his speech. He also talked about choosing not to share data, which is fine as long as you make that decision because you don’t want to share the data, not because you can’t.
Berners-Lee made a number of interesting points around that idea, and commented on some other trends that will accompany the development of the semantic web. Here are some of the highlights of his analysis:
On linking data using open standards in the enterprise
Most companies struggle with the problem of trying to look across disparate silos of information every time they have to react to an event. “Within a company, within the firewall, you need to be able to access [that data]. If I start off talking to a CIO about using open standards to interrelate with other companies — I get back, don’t even go there. Inside my company we have all these data pipes. We need to [link] it internally before we integrate it up and down the supply chain.
“Then you make a business decision about what data you want to share. That said, the technology doesn’t mean you are forced to share everything. You can keep a lot of stuff behind the firewall. But, you may find that business runs better if you do share a lot with your partners,” Berners-Lee said.
On a data browser
“We don’t have a generic data browser,” he said, noting, however, that MIT has a project called Tabulator, a Firefox extension that attempts to be a generic browser for linked data on the web. “Instead of looking at pages, it looks at things,” and what is related to them, finding patterns.
For example, you might find your town in DBpedia, and then a singer on MusicBrainz who was born in that town, and then an album that singer has brought out, and then more singers with albums who hail from towns close to your own. “It lets you go between exploring the web and using it [like] a spreadsheet basis. That’s the sort of thing I’d like to be able to use as a generic data browser. When we have linked data, more and more we will find that a good user interface is a tremendous research challenge, and there will be a huge competitive market to allow a user to get the most of it.”
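The town-to-singer-to-album walk described above can be pictured as following links through a set of subject-predicate-object statements. The sketch below is only an illustration: the identifiers, the predicate names `bornIn` and `byArtist`, and the in-memory list are all made up for the example and are not taken from DBpedia, MusicBrainz, or Tabulator.

```python
# Hypothetical linked-data statements as (subject, predicate, object) tuples.
TRIPLES = [
    ("singer:ann", "bornIn", "town:springfield"),
    ("singer:bob", "bornIn", "town:shelbyville"),
    ("album:first", "byArtist", "singer:ann"),
    ("album:second", "byArtist", "singer:bob"),
]


def subjects(predicate, obj):
    """Return every subject that has the given predicate and object."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


def albums_from_town(town):
    """Follow town -> singers born there -> albums by those singers."""
    found = []
    for singer in subjects("bornIn", town):
        for album in subjects("byArtist", singer):
            found.append((singer, album))
    return found


print(albums_from_town("town:springfield"))
```

A real data browser would fetch these statements from the web as it goes rather than from a fixed list, but the navigation step is the same: each answer becomes the starting point for the next lookup.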
On the social, ethical consequences of proliferating personal data
Most people involved in the semantic web are building programs with a strong awareness of the issue of data trust, Berners-Lee said, noting that a lot of people are worried about the provenance of data and how much to trust it. But perhaps the bigger issue is that there are lots of cases where you can’t keep data locked up — for example, in its migration across social networking sites. So “social networking sites have to track what you wanted other people to use that data for. There has to be more accountability and more of a concept of acceptable use,” he said.
Ethics comes into play here. For example, say an individual puts data out there about his trip on Route 66 across America; he should be able to specify that the data is for use by friends who want to see details about the trip, or students who are perhaps doing a project on Route 66. But “society has to move to an ethos where, if you are employing someone, you can’t pull together this data that on my trip I went from bar to bar,” and let that influence your hiring decision. Nor should an insurance company be able to change your premiums because it looked at the GPS tags from your trip and concluded you are an alcoholic.
“There have to be changes in the attitude from trying to lock data down to building systems that are accountable, that track where it came from and where it’s going to,” he said, and how, or if, it is allowed to be used in each instance.
On who’s going to mark up data on the semantic web
“There’s a big misunderstanding that humans have to do the markup. The semantic web is about exposing the data, not marking up pages with the data,” he said. “With the semantic web, most of the data, the vast majority of it, is already in relational databases,” which can then be tuned to expose data as RDF and support SPARQL queries.
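To make the idea of exposing relational data as RDF concrete, here is a minimal sketch that turns each row of a SQLite table into N-Triples lines. It is an assumption-laden illustration, not a description of any tool mentioned at the conference: the `http://example.org/` namespace, the `singer` table, and the `rows_to_ntriples` helper are hypothetical, every value is emitted as a plain string literal, and only backslashes and double quotes are escaped.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for the example


def rows_to_ntriples(conn, table, key):
    """Yield one N-Triples line per non-key, non-NULL column of each row."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(columns, row))
        # The primary key becomes part of the subject URI.
        subject = f"<{BASE}{table}/{record[key]}>"
        for col, value in record.items():
            if col == key or value is None:
                continue
            predicate = f"<{BASE}{table}#{col}>"
            # Minimal escaping only; a real exporter would also handle
            # newlines, datatypes, and foreign keys as links.
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            yield f'{subject} {predicate} "{literal}" .'


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, birthplace TEXT)"
)
conn.execute("INSERT INTO singer VALUES (1, 'Example Singer', 'Springfield')")

triples = list(rows_to_ntriples(conn, "singer", "id"))
for line in triples:
    print(line)
```

No one marks anything up by hand here: the table and column names supply the vocabulary, and the rows supply the data.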
On how to move your company to the linked data and semantic web track
“The important thing is not to threaten people. You don’t go in there saying, ‘OK guys, you database people, I’ve had it up to here. You have till Friday, or I am putting [everything] in a triple store.’ This is not the way to approach database people. You have spent years giving them the ability to produce the data you need, setting up access to data warehouses, [etc.]. You leave it running and doing everything and put in a little ‘shim’ to do a SPARQL query, and then you can start to do interesting value-added things. Then you can start sharing data with others, and start demanding from others that they share with you.”
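The “shim” idea can be sketched as a thin layer that leaves the database untouched and translates a graph-style question into ordinary SQL. The example below handles only a single triple pattern, which is far less than a real SPARQL endpoint does; the `match_pattern` function, the `singer` table, and the `http://example.org/singer` namespace are hypothetical names chosen for illustration.

```python
import sqlite3

BASE = "http://example.org/singer"  # hypothetical namespace
EXPOSED = ["name", "birthplace"]    # columns offered as predicates


def match_pattern(conn, subject=None, predicate=None, obj=None):
    """Answer one triple pattern against the 'singer' table.

    None stands for a variable. Results are (subject, predicate, object)
    tuples built from rows the existing database already holds.
    """
    columns = EXPOSED
    if predicate is not None:
        columns = [c for c in EXPOSED if f"{BASE}#{c}" == predicate]

    row_id = None
    if subject is not None:
        prefix = BASE + "/"
        if not subject.startswith(prefix) or not subject[len(prefix):].isdigit():
            return []
        row_id = int(subject[len(prefix):])

    results = []
    for col in columns:  # col comes from the fixed EXPOSED whitelist
        sql = f"SELECT id, {col} FROM singer WHERE {col} IS NOT NULL"
        params = []
        if row_id is not None:
            sql += " AND id = ?"
            params.append(row_id)
        if obj is not None:
            sql += f" AND {col} = ?"
            params.append(obj)
        sql += " ORDER BY id"
        for found_id, value in conn.execute(sql, params):
            results.append((f"{BASE}/{found_id}", f"{BASE}#{col}", value))
    return results


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, birthplace TEXT)"
)
conn.executemany(
    "INSERT INTO singer VALUES (?, ?, ?)",
    [(1, "Ann", "Springfield"), (2, "Bob", "Shelbyville")],
)

# "Who was born in Springfield?" expressed as a pattern with a free subject.
print(match_pattern(conn, predicate=BASE + "#birthplace", obj="Springfield"))
```

The existing tables, applications, and reports keep running as before; the shim only adds a second way in.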

