At the Semantic Technology and Business conference in San Francisco Monday, OCLC technology evangelist Richard Wallis broke the news that content negotiation has been implemented for the publication of Linked Data for WorldCat resources. Last June, WorldCat.org began publishing Linked Data for its bibliographic treasure trove, a global catalog of more than 290 million library records and some 2 billion holdings, leveraging schema.org to describe the assets.
“Now you can use standard Linked Data technologies to bring back information in RDF/XML, JSON, or Turtle,” Wallis said. Or triples. “People can start playing with this today.” As he writes in his blog post discussing the news, users can manually specify the serialization format they prefer to work with or display, or do so from within a program by using the HTTP protocol to state which format the program will accept when it accesses the URI.
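For readers who want to try the programmatic route, here is a minimal sketch of HTTP content negotiation in Python, in which the client names its preferred serialization in the `Accept` request header. The media types are the standard ones for each RDF serialization; whether WorldCat honors each of them exactly as written is an assumption, and the helper names `build_request` and `fetch` are illustrative, not part of any OCLC tooling.

```python
from urllib.request import Request, urlopen

# Standard media types for common RDF serializations. Which of these a
# particular server supports is an assumption, not confirmed by the article.
MEDIA_TYPES = {
    "rdfxml": "application/rdf+xml",
    "jsonld": "application/ld+json",
    "turtle": "text/turtle",
    "ntriples": "application/n-triples",
}


def build_request(uri, fmt):
    """Build a GET request that asks for `uri` in the serialization `fmt`."""
    # The Accept header is how the client states its preferred format.
    return Request(uri, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(uri, fmt, timeout=10):
    """Send the request and return (Content-Type header, body text)."""
    with urlopen(build_request(uri, fmt), timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        return resp.headers.get("Content-Type"), body
```

Calling `fetch(uri, "turtle")` with the URI of a Linked Data resource should return the Turtle serialization if the server supports it. Checking the returned `Content-Type` is worthwhile, because a server is free to answer with a different format than the one requested.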
“Two hundred ninety million records on the web of Linked Data is a pretty good chunk of stuff when you start talking content negotiation,” Wallis told the Semantic Web Blog.
Linked Data was front and center in Wallis’ presentation about moving from records to the global Knowledge Graph. It’s a key marker on the road for libraries, archives and museums to once again become a – if no longer the – place for people who now turn to sources like Google, open government data sets, and data-savvy media outlets like the BBC to get information.
“The world is moving in the Linked Data direction but most libraries are still creating MARC records,” he said. “The web is already moving on without libraries. So to be a part of how the world is operating you have to put data in the same format. This is an opportunity to reach for together, to reassert our role as a resource when people are looking for information. We will never get our monopoly back but we should become one of the group of [resources] you go.”
The work WorldCat has done so far on the Linked Data front is starting to have an effect, if only just visible, as he put it, in the more obscure pieces of information. For instance, drawing on WorldCat’s publishing of Linked Data using the schema.org vocabulary, the main Google index is now starting to show information about technical and academic works that until recently “you’d only find in Google Scholar or Google Book Search,” he said. “But how many people know to go there to search for something?”
With WorldCat results starting to appear within the main search index of Google, students and others at least have an indirect route to the libraries that can support their research needs. “If you are to effectively advertise your resources you want to use the most obvious language and Google, Bing, Yahoo and Yandex don’t understand MARC,” he said.
Wallis explained to attendees that schema.org is “almost good enough” for sharing library data with the world but should really hit the mark once the W3C Schema Bib Extend Community Group, which he chairs, makes its proposed enhancements to the schema.org vocabulary. Ideally that will be sooner rather than later, as he wants the group to be short-lived so that the library community can move ahead with greater speed than it is generally used to. “The objective is to be visible on the web of data, to announce to the world that you have a particular resource,” he said. “You as an individual university or public library have to stand up on the web of data and say, ‘I’ve got one of those and it’s the same as everyone else who has one of these.’”