Jack Schofield of ZDNet recently wrote, “HTML 5.1 is well under way and should become a Recommendation in 2016, and a first working draft of HTML 5.2 is expected next year. In sum, HTML 5 will continue for some time, and I don’t see any prospect of an HTML 6. However, there are clearly lots of things that don’t fit under the HTML 5 umbrella, even the broad version that subsumes separate but associated things like CSS. How will those be developed, and what will the new project be called?”
Posts Tagged ‘Web of Data’
“There is nothing more difficult to plan, more doubtful of success, nor more dangerous to manage than the creation of a new order of things…. Whenever his enemies have the ability to attack the innovator, they do so with the passion of partisans, while the others defend him sluggishly, so that the innovator and his party alike are vulnerable.”
–Niccolò Machiavelli, The Prince (1513)
The Semantic Web is not here yet.
Then again, neither are flying cars, the cure for cancer, human travel to Mars, or a host of other futuristic ideas that still have merit.
A problem with many of these articles is that they conflate the Vision of the Semantic Web with the practical technologies associated with the standards. While the Whole Enchilada has yet to emerge (and may never do so), the individual technologies are finding their way into ever more systems in a wide variety of industries. These are not all necessarily on the public Web; they are simply Webs of Data. There are plenty of examples of this happening, and I won’t reiterate them here.
Instead, I want to highlight some other things that are going on in this discussion that are largely left out of these narrowly-focused, provocative articles.
First, the Semantic Web has had a name attached to its vision for quite some time. As such, it is easy to remember, and just as easy to remember that it Hasn’t Gotten Here Yet. Every year or so, we get another round of articles that are more about cursing the darkness than lighting candles.
In that same timeframe, however, we’ve seen the ascent and flameout of Service-Oriented Architectures (SOA), Enterprise Service Buses (ESBs), various MVC frameworks, server-side architectures, etc. Everyone likes to announce a $20 million sale of an ESB to a client. Almost no one reports on the $100 million write-downs on failed initiatives when they surface in annual reports a few years later. So we are left with a skewed perspective on the efficacy of these big “conventional” initiatives.
Gabey Goh of the Malay Mail Online recently shared highlights from a lecture that Dame Wendy Hall gave at the University of Southampton’s Nusajaya Johor campus. Goh writes, “Science fiction author William Gibson famously defined ‘cyberspace’ as “a consensual hallucination experienced daily by billions of legitimate operators in every nation” in his 1984 book Neuromancer. However, this consensual hallucination that now underpins a very real economy, a digitally-driven network of information, goods and services, is not without its weaknesses. ‘The scary thing is that the Internet as we know it is still a baby and we have to take care of it. We created the Web, so it’s up to us as a society to look after it, to help make sure our governments or big companies not do things to it,’ Hall said. On-going issues with online social etiquette and cyber-bullying form part of the current crop of issues that can only be solved if we begin with education, Hall believes.”
If the spreadsheet is the window into enterprise data, what is the window into the emerging web of data? How to build that window is the question Harish Kumar wanted to explore, and it’s the question he would like to answer with the technology behind his startup Semgel.
“If you have disparate data sets distributed across the world, how to make it simple to gather that and start analyzing that,” Kumar says. The new venture’s application is based on semantic web technologies and is, at its core, capable of consuming RDF/OWL data of all kinds. Right now, it showcases that capability by letting users search crunchbase.com for tech companies, investors and people and instantly create databases by grabbing one or more of those entities.
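The idea of consuming data "of all kinds" and querying it uniformly rests on RDF's triple model: everything is a subject-predicate-object statement, and a query is just pattern matching over those statements. A minimal pure-Python sketch of that model (the entity names and predicates below are illustrative, in the spirit of the Crunchbase-style entities mentioned above; they are not Semgel's actual data model):

```python
# Minimal in-memory triple store: triples are (subject, predicate, object)
# tuples, and a query is pattern matching with None as a wildcard --
# the same shape RDF data takes, whatever source it comes from.

def match(triples, s=None, p=None, o=None):
    """Return all triples matching the given pattern (None = any)."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

triples = [
    ("ex:acme", "rdf:type", "ex:Company"),
    ("ex:acme", "foaf:name", "Acme Corp"),
    ("ex:acme", "ex:investor", "ex:jane"),
    ("ex:jane", "foaf:name", "Jane Doe"),
]

# "Grab" one entity and everything known about it:
print(match(triples, s="ex:acme"))
# Find every name, regardless of which source contributed the triple:
print(match(triples, p="foaf:name"))
```

Because a file, a feed, or a database dump all flatten to the same triple shape, merging sources is list concatenation rather than schema integration, which is what makes the "instant database" demo plausible.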
Linked data is becoming even more interesting to the OCLC, a non-profit, membership-based computer library service and research organization of 72,000 libraries in 170 countries and territories around the world. It has named Richard Wallis — formerly of the U.K.’s Talis Linked Data and Semantic Web Technology company, and one of our frequent Semantic Web Blog guest authors — to the position of Technology Evangelist.
The OCLC has as a major asset WorldCat, a global catalog comprising the collections of more than 10,000 libraries and adding up to more than 258 million records and 1.8 billion-plus holdings, in traditional library metadata format. WorldCat.org is the publicly searchable view of that core data, based upon library (MARC) records. More semantic web-oriented is other work the OCLC has been doing over the last couple of years, Wallis explains, including experiments with using RDF/Linked Data at viaf.org, where the Virtual International Authority File publishes authoritative descriptions of names and organizations, and something similar for the Dewey Decimal Classification system at dewey.info.
In his new role, Wallis will collaborate with members and facilitate projects with OCLC teams as libraries and the cooperative drive efforts to expose WorldCat data as linked data, and will represent OCLC and WorldCat to the global library and web/IT leader communities. The VIAF and Dewey projects certainly provided an opportunity for OCLC to see the benefit of linking things together. On top of that, “the climate for Linked Data and libraries has changed dramatically over the last 12 months,” Wallis says.
Interest was evident at the Linked Data in Libraries event he ran for Talis this past summer, for example, and efforts like the W3C’s Linked Data in Libraries interest group, the Linked Open Data in Libraries, Archives & Museums work, the British Library’s work on the British National Bibliography as Linked Open Data, and the Library of Congress’s Bibliographic Framework Initiative General Plan all are adding fuel to the fire.
The opportunity is there for the OCLC to take the lead on Linked Data in the somewhat fragmented library world as those organizations start to hear more and more about the concept. “Linked Data is starting to be something talked about in the library world, but like any other world, it’s still a bit of an enthusiast environment,” Wallis says. As he evangelizes to the library community what Linked Data is about – and to the web community about what the OCLC is doing with its chunk of data that is relevant to the wider Linked Data and Web of Data world – he hopes “to be in at the beginning of a process where those two communities come together to help come up with the best way of applying Linked Data principles to library data.”
In a statement announcing the appointment, Robin Murray, OCLC Vice President, Global Product Management, said, “Richard Wallis is a leader in Semantic Web and Linked Data technology, and we believe he will help the OCLC cooperative extend our efforts to help libraries move to Webscale.”
Data Liberate, the consultancy Wallis began upon leaving Talis, will continue as a personal blogging site. “I still have interests wider than the library community and I believe that those interests can keep me up to date with the wider world and inform my advice to the OCLC,” he says.
James Hendler was recently interviewed regarding the state of the World Wide Web and advances in semantic technology. Asked about the proliferation of the web, Hendler commented, “The Web is changing very fast and it has a very rapid effect on our economy. Consider something like aeroplanes which, as a subject, has been studied all along. On the contrary, the Web has happened so fast and hit so many places that we never really had time to understand it. Many of the periodical works on the Web are being done on the data collected in 1999. In 1999 Facebook didn’t exist. Twitter didn’t exist. A lot of people study Twitter. But again that is just one thing. Wikipedia has been successful, while most ‘wikis’ have failed. Online, we are now discovering the power of the (individual’s) voice and governments do not know how to deal with it.”
It cannot be denied that Stephen Wolfram knows data. As the person behind Mathematica and Wolfram|Alpha, he has been working with data — and the computation of that data — for a long time. As he said in his blog yesterday, “In building Wolfram|Alpha, we’ve absorbed an immense amount of data, across a huge number of domains. But—perhaps surprisingly—almost none of it has come in any direct way from the visible internet. Instead, it’s mostly from a complicated patchwork of data files and feeds and database dumps.”
The main topic of Wolfram’s post is a proposal about the form and placement of raw data on the internet. In the post, he proposes that .data be created as a new generic Top-Level Domain (gTLD) to hold data in a “parallel construct.”
Day 4 of ISWC 2011 was the second full day of the conference and started out with a keynote from Frank van Harmelen, titled “10 Years of Semantic Web: does it work in theory?” There were several sessions on RDF Querying of Multiple Sources, RDF Data Analysis, Formal Ontology & Patterns, Knowledge Representation Semantics, Web of Data, Justifications and Provenance, the In Use track on Environmental data, the Semantic Web Challenge and a very exciting Deathmatch panel.
The main question addressed in the keynote was whether a decade of Semantic Web work has helped to discover any Computer Science laws. Frank stated that what has been built in the past 10 years can be characterized in three parts:
The holy grail of the Semantic Web is to have intelligent agents that will be able to do all types of stuff for us, similar to what Siri is starting to do. Imagine my Semantic Web agent knows that I’ll be traveling to Bonn, Germany and will make a reservation at a restaurant that it thinks that I would like and that a friend has recommended. Theoretically, this is possible if all the data on the Web was published as Linked Data. Just imagine TripIt data linked to Facebook and to DBpedia which in turn is linked to Yelp and OpenTable. My Semantic Web agent would be able to query all of this data together and pull it off. Technically, the technology exists to allow this to happen. The only things that are missing are:
- data published as Linked Data on the Web, including links between data from different sources, and
- a way to query everything together.

I’m personally excited about the second issue: querying the Web as if it were a gigantic database.
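What that second missing piece amounts to can be sketched in a few lines: two datasets that reuse the same URIs, joined as if they were one database. Everything below is hypothetical toy data (a real agent would issue SPARQL queries against live TripIt, Yelp, and DBpedia endpoints, which is exactly what doesn't exist yet):

```python
# Two hypothetical Linked Data sources. Reusing the same URI for Bonn
# ("dbpedia:Bonn") is what makes a join across sources possible.
tripit = [  # my travel data
    ("ex:me", "trip:destination", "dbpedia:Bonn"),
]
yelp = [  # restaurant data keyed by the same city URIs
    ("ex:cafe_blau", "yelp:city", "dbpedia:Bonn"),
    ("ex:cafe_blau", "yelp:rating", "4.5"),
    ("ex:wurst_haus", "yelp:city", "dbpedia:Cologne"),
]

# "The Web as one gigantic database": just merge the triples.
web_of_data = tripit + yelp

# Roughly the join a SPARQL engine would perform for:
#   SELECT ?r WHERE { ex:me trip:destination ?city . ?r yelp:city ?city }
destinations = {o for s, p, o in web_of_data
                if s == "ex:me" and p == "trip:destination"}
restaurants = [s for s, p, o in web_of_data
               if p == "yelp:city" and o in destinations]
print(restaurants)  # ['ex:cafe_blau']
```

The join itself is trivial; the hard, still-unsolved parts are the ones listed above — getting the data published with shared identifiers, and an engine that can run such queries across the live Web rather than a merged local copy.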
AKSW has announced the latest release of their project, LIMES, “a link discovery framework for the Web of Data. It implements time-efficient approaches for large-scale link discovery based on the characteristics of metric spaces. It is easily configurable via a web interface. It can also be downloaded as standalone tool for carrying out link discovery locally.” According to the AKSW blog, “We could not resist the pleasure of making the demo of the new release candidate of LIMES (0.5RC1) available for all. LIMES 0.5 comes fitted with a new grammar for complex metric specification and fully novel algorithms.”
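In spirit, link discovery of the kind LIMES automates compares resources from a source and a target dataset under a similarity metric and emits links for pairs that clear an acceptance threshold. A minimal pure-Python sketch using the stdlib's difflib (the datasets and the 0.9 threshold are illustrative; LIMES itself earns its time efficiency with metric-space pruning based on the triangle inequality, which avoids the brute-force pairwise comparison shown here):

```python
from difflib import SequenceMatcher

# Two hypothetical datasets describing overlapping real-world cities.
source = {"src:berlin": "Berlin", "src:koeln": "Koeln"}
target = {"tgt:berlin": "Berlin", "tgt:cologne": "Cologne", "tgt:bonn": "Bonn"}

def similarity(a, b):
    """String similarity in [0, 1] on lowercased labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Brute-force comparison: every source resource against every target one.
links = []
for s_uri, s_label in source.items():
    for t_uri, t_label in target.items():
        if similarity(s_label, t_label) >= 0.9:  # acceptance threshold
            links.append((s_uri, "owl:sameAs", t_uri))

print(links)  # [('src:berlin', 'owl:sameAs', 'tgt:berlin')]
```

Note that "Koeln" and "Cologne" fall below the threshold here, which is precisely why real frameworks support richer, configurable metrics — the "new grammar for complex metric specification" the release notes mention.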