Linked Data

The Oxford Dictionary of National Biography Turns Ten

David Hill Radcliffe of the OUPblog recently wrote, “The publication of the Oxford Dictionary of National Biography in September 2004 was a milestone in the history of scholarship, not least for crossing from print to digital publication. Prior to this moment a small army of biographers, myself among them, had worked almost entirely from paper sources, including the stately volumes of the first, Victorian ‘DNB’ and its 20th-century print supplement volumes. But the Oxford DNB of 2004 was conceived from the outset as a database and published online as web pages, not paper pages reproduced in facsimile. In doing away with the page image as a means of structuring digital information, the online ODNB made an important step which scholarly monographs and articles might do well to emulate.” Read more

Want a music suggestion? Just ask DJ Twitter.

Alexandre Passant, founder of seevl, which we have covered before, has hacked together a cool proof of concept. He describes the project as using “Twitter As A Service,” and it leverages Twitter, YouTube, and the seevl API. As he explains, “The result is a twitter bot, running under our @seevl handle, which accepts a few (controlled) natural-language queries and replies with an appropriate track, embedded in a Tweet via a YouTube card.”

He continues, “As it’s all Twitter-based, not only you can send messages, but you can have a conversation with your virtual DJ.”
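For readers curious about the moving parts, here is a rough sketch of how such a mention-and-reply bot could be wired together in Python using the tweepy library. The seevl/YouTube lookup is reduced to a placeholder function, since the actual seevl API calls Passant used are not shown here.

```python
# A rough sketch of a "DJ Twitter" bot: read mentions, reply with a track link.
# The find_track_url() helper is a placeholder, not the real seevl/YouTube logic.
import tweepy

def find_track_url(query):
    """Placeholder: map a natural-language request to a YouTube URL.
    In the real bot this is where the seevl and YouTube APIs come in."""
    return "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth)

for mention in api.mentions_timeline(count=10):
    reply = "@{} {}".format(mention.user.screen_name, find_track_url(mention.text))
    # Twitter renders the YouTube link as an embedded card in the reply.
    api.update_status(status=reply, in_reply_to_status_id=mention.id)
```

Because replies thread naturally on Twitter, the same loop is enough to support the back-and-forth “conversation with your virtual DJ” that Passant describes.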

Read more

You Can Help Make Linked Data Core To The Future of Identity, Payment On The Web Platform

At the end of September, the World Wide Web Consortium (W3C) may approve the world’s first Web Payments Steering Group, to explore issues such as navigating around obstacles to seamless payments on the web and ways to better facilitate global transactions while respecting local laws. Identity and digital signatures have a role here, even as they extend beyond payments into privacy and other arenas. At the end of October, there will also be a W3C technical plenary to discuss identity, graph normalization, digital signatures, and payments technologies.

Expect Linked Data to come up in the context of both events, Manu Sporny told attendees at this August’s 10th annual Semantic Technology & Business Conference in San Jose during his keynote address, entitled Building Linked Data Into the Core of the Web. “It is the foundational data model to build all this technology off of,” said Sporny, who is the founder and CEO of Digital Bazaar, which develops technology and services to make it easier to buy and sell digital content over the Internet. (See our stories about the company and its technology here.) He is also founder and chair of the W3C Web Payments Community Group, chair of its RDFa Working Group, and founder, chair, and lead editor of the JSON-LD Community Group.
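Sporny’s “foundational data model” point is easiest to see in a small example. The sketch below uses the pyld package with an invented, purely illustrative vocabulary to show how JSON-LD expansion turns ordinary JSON keys into unambiguous IRIs, which is what lets payment, identity, and other data interlink across the web.

```python
# A minimal sketch (illustrative vocabulary, not any payments spec):
# JSON-LD expansion maps plain JSON keys to globally unique IRIs.
# Requires the pyld package (pip install PyLD).
from pyld import jsonld

doc = {
    "@context": {
        "name": "http://schema.org/name",
        "price": "http://schema.org/price",
        "currency": "http://schema.org/priceCurrency",
    },
    "name": "Example digital download",
    "price": "4.99",
    "currency": "USD",
}

# Expansion replaces context-dependent keys with full IRIs,
# giving every field a linkable, machine-interpretable meaning.
print(jsonld.expand(doc))
```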

Read more

Deconstructing JSON-LD

Aaron Bradley recently posted a roundtable discussion about JSON-LD, in which he writes: “JSON-LD is everywhere. Okay, perhaps not everywhere, but JSON-LD loomed large at the 2014 Semantic Web Technology and Business Conference in San Jose, where it was on many speakers’ lips, and could be seen in the code examples of many presentations. I’ve read much about the format – and have even provided a thumbnail definition of JSON-LD in these pages – but I wanted to take advantage of the conference to learn more about JSON-LD, and to better understand why this very recently-developed standard has been such a runaway hit with developers. In this quest I could not have been more fortunate than to sit down with Gregg Kellogg, one of the editors of the W3C Recommendation for JSON-LD, to learn more about the format, its promise as a developmental tool, and – particularly important to me as a search marketer – the role in the evolution of schema.org.”
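For search marketers, much of the appeal is how little ceremony schema.org markup needs in JSON-LD. The snippet below is a minimal, hypothetical example (the event details are invented) of the kind of description that can be dropped into a page inside a script tag of type application/ld+json; Python’s standard json module is enough to produce it.

```python
import json

# A minimal, hypothetical schema.org event description of the kind a
# search marketer might embed in a page as JSON-LD (details invented).
event = {
    "@context": "http://schema.org",
    "@type": "Event",
    "name": "Deconstructing JSON-LD",
    "startDate": "2014-08-19",
    "location": {"@type": "Place", "name": "San Jose, CA"},
}

# This string is what would sit inside
# <script type="application/ld+json"> ... </script>
print(json.dumps(event, indent=2))
```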

Read more

DBpedia 2014 Announced

Professor Dr. Christian Bizer of the University of Mannheim, Germany, has announced the release of DBpedia 2014. DBpedia is described at dbpedia.org as “… a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link the different data sets on the Web to Wikipedia data. We hope that this work will make it easier for the huge amount of information in Wikipedia to be used in some new interesting ways. Furthermore, it might inspire new mechanisms for navigating, linking, and improving the encyclopedia itself.”

The full announcement on the new release is reprinted below with Bizer’s permission.

****************

DBpedia Version 2014 released

1. The new release is based on updated Wikipedia dumps dating from April/May 2014 (the 3.9 release was based on dumps from March/April 2013), leading to an overall increase in the number of things described in the English edition from 4.26 million to 4.58 million.

2. The DBpedia ontology has been enlarged and the number of infobox-to-ontology mappings has risen, leading to richer and cleaner data.

The English version of the DBpedia knowledge base currently describes 4.58 million things, out of which 4.22 million are classified in a consistent ontology (http://wiki.dbpedia.org/Ontology2014), including 1,445,000 persons, 735,000 places (including 478,000 populated places), 411,000 creative works (including 123,000 music albums, 87,000 films and 19,000 video games), 241,000 organizations (including 58,000 companies and 49,000 educational institutions), 251,000 species and 6,000 diseases. Read more
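As a concrete illustration of the “sophisticated queries” DBpedia supports, here is a small sketch using the SPARQLWrapper package against the public endpoint. The specific query (large German cities by population) is just an example, and results will reflect whatever the live endpoint currently holds.

```python
# Query the public DBpedia SPARQL endpoint for German cities with
# populations over 500,000. Requires SPARQLWrapper (pip install sparqlwrapper).
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?city ?population WHERE {
        ?city a dbo:City ;
              dbo:country <http://dbpedia.org/resource/Germany> ;
              dbo:populationTotal ?population .
        FILTER (?population > 500000)
    } ORDER BY DESC(?population)
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["city"]["value"], row["population"]["value"])
```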

Nervana Systems Raises $3.3M for Deep Learning

Derrick Harris of GigaOM reports, “Nervana Systems, a San Diego-based startup building a specialized system for deep learning applications, has raised a $3.3 million series A round of venture capital. Draper Fisher Jurvetson led the round, which also included Allen & Co., AME Ventures and Fuel Capital. Nervana launched in April with a $600,000 seed round. The idea behind the company is that deep learning — the advanced type of machine learning that is presently revolutionizing fields such as computer vision and text analysis — could really benefit from hardware designed specifically for the types of neural networks on which it’s based and the amount of data they often need to crunch.” Read more

eXframe Platform Demos Power Of The Semantic Web For Biology

A Drupal++ platform for semantic web biomedical data – that’s how Sudeshna Das describes eXframe, a reusable framework for creating online repositories of genomics experiments. Das – who among other titles is affiliate faculty of the Harvard Stem Cell Institute – is one of the developers of eXframe, which leverages Stéphane Corlosquet’s RDF module for Drupal to produce, index (into an RDF store powered by the ARC2 PHP library) and publish semantic web data in the second-generation version of the platform.

“We used the RDF modules to turn eXframe into a semantic web platform,” says Das. “That was key for us because it hid all the complexities of semantic technology.”

One instance of the platform today can be found in the repository for stem cell data that forms part of the Stem Cell Commons, the Harvard Stem Cell Institute’s community for stem cell bioinformatics. But, Das notes, the importance of the platform’s reusability for building genomics repositories that automatically produce Linked Data and a SPARQL endpoint is that new repository instances can be created with much less effort. Working off Drupal as its base, eXframe has been customized to support biomedical data and to integrate biomedical ontologies and knowledge bases.
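To make the “automatically produce Linked Data” idea concrete, here is a minimal sketch, built with the rdflib package, of the sort of triples a repository like this might publish for one experiment. The experiment URI and the schema.org terms are illustrative assumptions, not eXframe’s actual vocabulary.

```python
# Minimal sketch: describe one (hypothetical) genomics experiment as RDF
# and serialize it as Turtle. Requires rdflib (pip install rdflib).
from rdflib import Graph, Literal, Namespace, RDF, URIRef

EX = Namespace("http://example.org/experiment/")       # illustrative namespace
SCHEMA = Namespace("http://schema.org/")

g = Graph()
exp = URIRef(EX["hsci-0001"])                           # invented experiment ID
g.add((exp, RDF.type, SCHEMA.Dataset))
g.add((exp, SCHEMA.name, Literal("Stem cell differentiation microarray")))
g.add((exp, SCHEMA.about, Literal("Homo sapiens")))

# The same triples are what a SPARQL endpoint over the repository would expose.
print(g.serialize(format="turtle"))
```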

Read more

A Venture Capitalist’s Take on the Internet of Things

David Hirsch, co-founder of Metamorphic Ventures, recently wrote for TechCrunch, “There has been a lot of talk in the venture capital industry about automating the home and leveraging Internet-enabled devices for various functions. The first wave of this was the use of the smartphone as a remote control to manage, for instance, a thermostat. The thermostat then begins to recognize user habits and adapt to them, helping consumers save money. A lot of people took notice of this first-generation automation capability when Google bought Nest for a whopping $3.2 billion. But this purchase was never about Nest; rather, it was Google’s foray into the next phase of the Internet of Things.” Read more

XSB and SemanticWeb.com Partner In App Developer Challenge To Help Build The Industrial Semantic Web

An invitation was issued to developers at last week’s Semantic Technology and Business Conference: XSB and SemanticWeb.com have joined to sponsor the Semantic Web Developer Challenge, which asks participants to build sourcing and product life cycle management applications leveraging XSB’s PartLink Data Model.

XSB is developing PartLink as a project for the Department of Defense Rapid Innovation Fund. It uses semantic web technology to create a coherent Linked Data model for all part information in the Department of Defense’s supply chain – some 40 million parts strong.

“XSB recognized the opportunity to standardize and link together information about the parts, manufacturers, suppliers, materials, [and] technical characteristics using semantic technologies. The parts ontology is deep and detailed, with 10,000 parts categories and 1,000 standard attributes defined,” says Alberto Cassola, VP of sales and marketing at XSB, a leading provider of master data management solutions to large commercial and government entities. PartLink’s Linked Data model, he says, “will serve as the foundation for building the industrial semantic web.”
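A sourcing application built against such a model ultimately comes down to graph queries. The sketch below is purely hypothetical (the “plk:” vocabulary and the sample part are invented, since PartLink’s actual schema isn’t shown here), but it uses the rdflib package to show the shape of a question a linked part model makes easy to ask: which stainless steel fasteners exist, and who makes them.

```python
# Hypothetical sketch only: the "plk:" terms and the sample part are invented
# stand-ins for a PartLink-style part model. Requires rdflib.
from rdflib import Graph, Literal, Namespace, RDF, URIRef

PLK = Namespace("http://example.org/partlink/ontology#")

g = Graph()
part = URIRef("http://example.org/partlink/part/NAS1351-3-8")
g.add((part, RDF.type, PLK.Fastener))
g.add((part, PLK.material, Literal("stainless steel")))
g.add((part, PLK.manufacturer, URIRef("http://example.org/partlink/mfr/ACME")))

# "Find every stainless steel fastener and who makes it."
query = """
    PREFIX plk: <http://example.org/partlink/ontology#>
    SELECT ?part ?mfr WHERE {
        ?part a plk:Fastener ;
              plk:material "stainless steel" ;
              plk:manufacturer ?mfr .
    }
"""
for row in g.query(query):
    print(row.part, row.mfr)
```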

Read more

Google Releases Linguistic Data Based on NY Times Annotated Corpus

Photo of New York Times Building in New York City

Dan Gillick and Dave Orr recently wrote, “Language understanding systems are largely trained on freely available data, such as the Penn Treebank, perhaps the most widely used linguistic resource ever created. We have previously released lots of linguistic data ourselves, to contribute to the language understanding community as well as encourage further research into these areas. Now, we’re releasing a new dataset, based on another great resource: the New York Times Annotated Corpus, a set of 1.8 million articles spanning 20 years. 600,000 articles in the NYTimes Corpus have hand-written summaries, and more than 1.5 million of them are tagged with people, places, and organizations mentioned in the article. The Times encourages use of the metadata for all kinds of things, and has set up a forum to discuss related research.”

The blog continues, “We recently used this corpus to study a topic called ‘entity salience’. To understand salience, consider: how do you know what a news article or a web page is about? Reading comes pretty easily to people — we can quickly identify the places or things or people most central to a piece of text. But how might we teach a machine to perform this same task? This problem is a key step towards being able to read and understand an article. One way to approach the problem is to look for words that appear more often than their ordinary rates.”
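The “appear more often than their ordinary rates” idea can be captured in a few lines. The toy sketch below (with made-up counts) scores each word in a document by the ratio of its in-document rate to its corpus-wide background rate; real entity salience models are more sophisticated than this, but the ratio is the core intuition.

```python
# Toy word-salience sketch: compare a word's rate in one document to its
# background rate across a corpus. All counts here are made up.
from collections import Counter

def salience_scores(doc_tokens, background_counts, background_total):
    """Score words by how much their in-document rate exceeds their
    background rate. Higher ratio = more salient."""
    doc_counts = Counter(doc_tokens)
    doc_total = len(doc_tokens)
    scores = {}
    for word, count in doc_counts.items():
        doc_rate = count / doc_total
        # Add-one smoothing keeps unseen words from dividing by zero.
        bg_rate = (background_counts.get(word, 0) + 1) / (background_total + 1)
        scores[word] = doc_rate / bg_rate
    return scores

background = Counter({"the": 70000, "said": 9000, "google": 40, "corpus": 5})
doc = "google released a corpus and google said researchers are welcome".split()

for word, score in sorted(salience_scores(doc, background, 100000).items(),
                          key=lambda kv: -kv[1])[:3]:
    print(word, round(score, 1))
```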

Read more here.

Photo credit: Eric Franzon
