Posts Tagged ‘R2RML’

W3C’s Semantic Web Activity Folds Into New Data Activity

The World Wide Web Consortium has headline news today: The Semantic Web and eGovernment Activities are being merged and superseded by the Data Activity, where Phil Archer serves as Lead. Two new working groups have also been chartered: CSV on the Web and Data on the Web Best Practices.

What’s driving this? First, Archer explains, the Semantic Web technology stack is now mature, and it’s time to allow those updated standards to be used. With RDF 1.1, the Linked Data Platform, SPARQL 1.1, RDB To RDF Mapping Language (R2RML), OWL 2, and Provenance all done or very close to it, it’s the right time “to take that very successful technology stack and try to implement it in the wider environment,” Archer says, rather than continue tinkering with the standards.

The second reason, he notes, is that a large community exists “that sees Linked Data, let alone the full Semantic Web, as an unnecessarily complicated technology. To many developers, data means JSON – anything else is a problem. During the Open Data on the Web workshop held in London in April, Open Knowledge Foundation co-founder and director Rufus Pollock said that if he suggested to the developers that they learn SPARQL he’d be laughed at – and he’s not alone,” Archer says. “We need to end the religious wars, where they exist, and try to make it easier to work with data in the format that people like to work in.”

The new CSV on the Web Working Group is an important step in that direction, following on the heels of efforts such as R2RML. Its job is to provide metadata about CSV files, such as column headings, data types, and annotations, and so make it easy to convert CSV into RDF (or other formats), easing data integration. “The working group will define a metadata vocabulary and then a protocol for how to link data to metadata (presumably using HTTP Link headers) or embed the metadata directly. Since the links between data and metadata can work in either direction, the data can come from an API that returns tabular data just as easily as it can a static file,” says Archer. “It doesn’t take much imagination to string together a tool chain that allows you to run SPARQL queries against ‘5 Star Data’ that’s actually published as a CSV exported from a spreadsheet.”
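As a rough illustration of the idea Archer describes, the sketch below pairs a CSV file with a small metadata description and uses it to emit RDF triples in N-Triples syntax. Everything specific here is an assumption made for illustration: the metadata layout (`base`, `key`, `columns`), the `example.org` IRIs, the function name `csv_to_ntriples`, and the sample row are hypothetical, and none of it reflects the vocabulary the working group will actually define.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical metadata: which column identifies a row, and which
# predicate and datatype each column maps to. Not a W3C vocabulary.
METADATA = {
    "base": "http://example.org/city/",
    "key": "name",
    "columns": {
        "name": {
            "predicate": "http://example.org/vocab#name",
            "datatype": None,
        },
        "population": {
            "predicate": "http://example.org/vocab#population",
            "datatype": "http://www.w3.org/2001/XMLSchema#integer",
        },
    },
}


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def csv_to_ntriples(csv_text, metadata):
    """Convert CSV text to a list of N-Triples lines using the metadata."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The key column's value becomes the last segment of the subject IRI.
        subject = "<" + metadata["base"] + quote(row[metadata["key"]]) + ">"
        for column, spec in metadata["columns"].items():
            value = row.get(column)
            if not value:
                continue  # empty cells produce no triple
            literal = '"' + escape_literal(value) + '"'
            if spec["datatype"]:
                literal += "^^<" + spec["datatype"] + ">"
            triples.append(f"{subject} <{spec['predicate']}> {literal} .")
    return triples


# Made-up sample data, for illustration only.
SAMPLE_CSV = "name,population\nSpringfield,1000\n"

for line in csv_to_ntriples(SAMPLE_CSV, METADATA):
    print(line)
```

Output like this could be loaded into any RDF store and queried with SPARQL, which is the tool chain the quote alludes to.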

Read more

Hadoop Meets Semantic Technology: Data Scientists Win

Hadoop is on almost every enterprise’s radar – even if they’re not yet actively engaged with the platform and its advantages for Big Data efforts. Analyst firm IDC earlier this year said the market for software related to the Hadoop and MapReduce programming frameworks for large-scale data analysis will have a compound annual growth rate of more than sixty percent between 2011 and 2016, rising from $77 million to more than $812 million.

Yet, challenges remain to leveraging all the possibilities of Hadoop, an Apache Software Foundation open source project, especially as it relates to empowering the data scientist. Hadoop is composed of two sub-projects: HDFS, a distributed file system built on a cluster of commodity hardware so that data stored in any node can be shared across all the servers, and the MapReduce framework for processing the data stored in those files.

Semantic technology can help solve many of the challenges, Michael A. Lang Jr., VP, Director of Ontology Engineering Services at Revelytix, Inc., told an audience gathered at the Semantic Technology & Business Conference in New York City yesterday.

Read more

Transforming Relational Data to RDF – R2RML Becomes Official W3C Recommendation

Today, the World Wide Web Consortium announced that R2RML has achieved Recommendation status. As stated on the W3C website, R2RML is “a language for expressing customized mappings from relational databases to RDF datasets. Such mappings provide the ability to view existing relational data in the RDF data model, expressed in a structure and target vocabulary of the mapping author’s choice.” In the life cycle of W3C standards creation, today’s announcement means that the specifications have gone through extensive community review and revision and that R2RML is now considered stable enough for widespread distribution in commodity software.

Richard Cyganiak, one of the Recommendation’s editors, explained why R2RML is so important. “In the early days of the Semantic Web effort, we’ve tried to convert the whole world to RDF and OWL. This clearly hasn’t worked. Most data lives in entrenched non-RDF systems, and that’s not likely to change.”

Read more

R2RML and Direct Mapping: W3C Proposed Recommendations

Yesterday, the W3C announced the advancement to Proposed Recommendations of two Relational Database to RDF (RDB2RDF) documents: 1) R2RML: RDB to RDF Mapping Language and 2) A Direct Mapping of Relational Data to RDF. Two Working Group Notes were also published: R2RML and Direct Mapping Test Cases and RDB2RDF Implementation Report.

Given that a vast amount of data in enterprises and on the web resides in Relational Databases, it is paramount to have methods that expose relational data as RDF, so that Semantic Web applications can interact with Relational Databases. The R2RML and Direct Mapping standards bridge this gap. Direct Mapping is an automatic default mapping, and R2RML is a mapping language in which users can customize the mappings. With these two standards, we will now be able to see more and more relational data in the Linked Data cloud and as part of Semantic Web applications.
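To give a feel for what an "automatic default mapping" means, here is a minimal Python sketch that turns one relational row into triples, loosely following the IRI shapes used by Direct Mapping (a row IRI built from the table name and primary key, a type triple for the table, and one triple per non-NULL column). It is a simplification under stated assumptions: it handles only a single-column primary key, ignores foreign keys, and emits every value as a plain string literal rather than a typed one. The function name `direct_map_row`, the `example.com` base IRI, and the sample `People` row are hypothetical.

```python
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def direct_map_row(base, table, primary_key, row):
    """Return simplified default-mapping triples for one row (a dict)."""
    table_iri = base + quote(table)
    # Row IRI shape: <base><table>/<pk column>=<pk value>
    key_value = quote(str(row[primary_key]))
    subject = f"<{table_iri}/{quote(primary_key)}={key_value}>"

    # Each row is typed with an IRI derived from its table.
    triples = [f"{subject} <{RDF_TYPE}> <{table_iri}> ."]

    for column, value in row.items():
        if value is None:
            continue  # NULL values produce no triple
        predicate = f"<{table_iri}#{quote(column)}>"
        # Assumption: plain string literals only, no datatype handling.
        literal = '"' + escape_literal(str(value)) + '"'
        triples.append(f"{subject} {predicate} {literal} .")
    return triples


# Made-up sample row, for illustration only.
sample_row = {"ID": 7, "fname": "Bob", "addr": None}
for line in direct_map_row("http://example.com/", "People", "ID", sample_row):
    print(line)
```

Nothing in this output is chosen by the user, which is the point of a default mapping. An R2RML mapping would instead let the author pick the subject IRI template, the class, and the predicates from a target vocabulary.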

A little bit of RDB2RDF history

Tim Berners-Lee wrote a Design Issue on “Relational Database on the Semantic Web” dating back to 1998. During the 2000s, several tools, such as R2O, D2RQ, Virtuoso RDF Views, Triplify, and Ultrawrap, were built to expose Relational Databases as RDF and even allow SPARQL to be executed directly on the relational database.

In October 2007, the W3C organized a workshop to gauge interest in mapping relational databases to RDF: RDF Access to Relational Databases. The outcome of this workshop was the formation of the RDB2RDF Incubator Group in 2008. The objective of this group was to classify existing approaches to mapping relational databases to RDF and then decide whether a standard was necessary. The Incubator Group had a face-to-face meeting in October 2008. The Incubator Group concluded its work with two deliverables: a Survey of Current Approaches for Mapping of Relational Databases to RDF and the RDB2RDF XG Final Report. The conclusion was to recommend the formation of a Working Group to standardize a mapping language.

Read more

Highlights from WWW 2012 Conference

This year’s World Wide Web Conference, the 21st, was held in Lyon, France. This conference is a unique forum for discussion about how the Web is evolving. There were hundreds of talks over 3 days. Let me summarize some Semantic Web presentations I was able to attend.

NautiLOD

Programmers use the wget tool daily to specify and retrieve data on the Web. However, wget is limited since it cannot dig into the semantics of Web data to do the job. What if you were to add semantics to wget? This is the question that Valeria Fionda, Claudio Gutierrez and Giuseppe Pirró asked themselves. They took that question to the next level: imagine a semantic wget on top of Linked Data. They wanted to create a language to declaratively specify portions of the Web of Data, define routes and instruct agents that can do things for you on the Web, all by exploiting the semantics of information (RDF data) found in online data sources. For example: find all the Wikipedia pages of directors who have been influenced by Stanley Kubrick and send them to my email, or retrieve information about David Lynch from different information providers. These examples only hint at what can be done. The researchers developed a simple, generic declarative language, NautiLOD, and implemented it in swget (semantic wget). swget comes in two flavors: a simple command line tool (to give the Web back to users) and a GUI. This is not a fantasy anymore. Check it out for yourself (http://swget.wordpress.com).

Read more

What W3C’s R2RML and Direct Mapping Mean to Enterprise Data

I’m very happy to announce that the World Wide Web Consortium’s RDB2RDF Working Group, in which I participate as an Invited Expert, has published two Candidate Recommendations: R2RML: RDB to RDF Mapping Language and A Direct Mapping of Relational Data to RDF. This has been a long road and we still have some ways to go. The standardization process goes back to the W3C Workshop on RDF Access to Relational Databases, which took place in October 2007. The W3C RDB2RDF Incubator Group followed afterwards. After almost 5 years, we are on track to have a standard. However, what is this standard bringing to the table?

Read more

Antidot’s Open Source db2triples Implements R2RML and Direct Mapping

Antidot, which makes the semantically-powered Information Factory and Antidot Finder Suite software, this month released its db2triples as open source component software, available here, which implements the W3C RDB2RDF Working Group’s proposed R2RML language and Direct Mapping, covered here.

Antidot, in fact, shared with the W3C its experience leveraging Direct Mapping and R2RML to, in just half a day, fetch information from hundreds of tables in a client’s Magento ecommerce database to transform it to a graph model. That’s normally a complex task, says Antidot founder and CEO Fabrice Lacroix, which would involve data transformation and database content indexing of an unknown database model. “No one [here at Antidot] knows the complex, dynamic data model from Magento, and it’s very difficult to reverse-engineer these sort of models,” he says.

“So with Direct Mapping and R2RML it is very easy to go directly from the database to the graph you need…and then extract just the business objects we need. We did it in just half a day. Imagine that. For such complex stuff that’s a very short timeframe.” Lacroix says that the company thought it only fair, after that success, to send something back to the community.

Read more

Catching Up With the W3C And Its Focus On the Enterprise

Underway at the W3C are some major Semantic Web efforts that can have a big impact on the enterprise.

One of these is about mapping relational databases to RDF. The RDB2RDF Working Group late last month issued a last call regarding the publication of its Direct Mapping and R2RML documents. As has been noted by Tim Berners-Lee, a big driving force for the Semantic Web has been the expression on the Web of the vast amounts of relational database information in a way that can be processed by machines. “We know that something like 80 or 90 percent of data published on the web comes from relational databases,” says Ivan Herman, W3C Semantic Web activity lead. “So it is important to make smooth the bridge between those two worlds, and this is what the working group was set up to do.”

Read more