(By Ivan Herman and Manu Sporny)

[EDITOR’S NOTE: This guest post, by Ivan Herman and Manu Sporny, was originally published on w3.org and appears here courtesy of the W3C and the authors of the article.  SemanticWeb.com is a member of the W3C.]

The World Wide Web Consortium’s RDF Web Applications Working Group (RDFWA WG) has published the first draft of the RDF Interfaces specification. This is a companion specification to the upcoming RDF API and the recently published RDFa API. Until recently, the group publishing this specification was known as the “RDFa Working Group”. After a need was identified for a common set of programming APIs for working with structured data on the Web, the RDFa Working Group was re-created as the RDF Web Applications Working Group. This article explains how the specifications that this group is producing work together to create a common Semantic Web publishing and development environment.

The Semantic Web has gained significant traction in the past few years. The buzz around this year’s Semantic Technologies conference, SemTech 2011, is a sign of the rapid growth of the Semantic Web. The amount of RDF data published on the Web is steadily growing thanks, for example, to the Linked Open Data movement, eGovernment initiatives, and the integration of RDFa into content management systems and popular Web destinations. However, if one looks at the applications that make use of that RDF data, most of them can be characterized as “server-side”. The bulk of the work of crawling, extracting, and processing the data is done behind the scenes, and the Web browser merely displays information that has been created elsewhere. That is, the Semantic Web has not yet reached the world of “Web Applications”. Semantic Web applications running within the browser and written almost exclusively in JavaScript are still few and far between. Web Applications operate in a very different programming environment: their developers have their own development styles, architectures, and distinct communities. The Web Application development community, in general, strives for much greater simplicity and a lower barrier to entry than server-side programmers and developers.

Various Working Groups at W3C, as well as developers and groups around the world, have been surveying this structured data landscape. In many ways, the development of RDFa and its successful deployment was the first step in this new world. It is also no coincidence that an updated version of RDFa is being finalized, taking into account the needs and desires of the general Web development community. The development of microformats and of microdata, though not closely bound to RDF, is also part of this landscape. There have also been passionate discussions on the pros and cons of a JSON serialization of RDF in W3C’s recently formed RDF Working Group. These discussions are still ongoing; see the Task Force’s Wiki page for further details. All of these communities were involved in the dialog that identified a core missing component for Web Applications bound to the Semantic Web: an API to access RDF as well as structured data in general.

This leads to the inevitable question: What type of API should be defined? The Working Group has discussed this question for a long time. Should the API hide the complexities of RDF or not? Should it focus on people who know RDF deeply or on people who don’t know or care about RDF? After several attempts to answer each of these questions, the group decided that a layered approach made the most sense. That is, we must recognize that there are several communities and we should provide for each of them, but in a way that builds into a unified whole. Hence the layered approach to the design of the RDF structured data APIs:

  1. The RDF Interfaces. A need was identified for a low-level API to expose RDF data to JavaScript. It does not contain any new concepts or abstractions. It provides a straightforward interface to RDF that those familiar with RDF will be comfortable using. This is the lowest level in the stack, is called the RDF Interfaces specification, and is the document that was just published by the W3C.
  2. The RDF API. Once the basic layer is in place, one can envisage different libraries building on top of the RDF Interfaces. For example, one could build a library for accessing SPARQL endpoints in JavaScript, much like Lee Feigenbaum’s sparql.js library. Other libraries could be built to handle SPARQL CONSTRUCT queries or, in the future, SPARQL 1.1 UPDATE. It is not the goal of the RDFWA WG to cover all possible libraries; innovation is best left to the communities across the Web. The RDFWA WG’s job is to build a basic framework where this innovation can happen. To provide a starting point, a simple API called the RDF API is currently under development. The goal is to provide an easy first step for Web Application developers who want to mash up existing RDF data on the Web without having to dive too deeply into advanced concepts like RDF modeling, inferencing, and the other more complex aspects of RDF. Conceptually, the RDF API is based on the RDF Interfaces specification. Practically, developers only need to use the RDF API and can safely ignore the lower-level RDF Interfaces if they do not have the time or inclination to study RDF in depth.
  3. The RDFa API. Since this journey started with RDFa, and the initial set of requests was for an RDFa API, one is provided for Web Application developers. While the RDF API is a simple entry point for general mash-up applications, the structured data expressed in RDFa-family languages like HTML, SVG, or EPUB requires Document Object Model (DOM) features that do not fit nicely into the RDF API. One example is accessing DOM nodes that contain a specific RDFa-encoded subject or predicate. This functionality and a set of simplified interfaces are provided by the RDFa API.

The rest of the article briefly touches on each of these layers. For those that want to learn more, the original drafts provide a more thorough introduction to each layer. While each document is in the draft stage at the W3C, the RDFWA Working Group does not expect large design changes in the coming months.

1. The RDF Interfaces layer

The goal of the RDF Interfaces specification is to provide programmatic access to the core of RDF. That is, it contains interfaces for Triples, Nodes identified by URIs, Blank Nodes, and Literals. There are also interfaces for creating and managing Graphs, including the ability to add and remove triples and merge graphs. The general concept of an RDF parser is expressed, leaving implementations to create the obvious RDF/XML, Turtle and RDFa parsers as well as the less obvious parsers for Microdata or an RDF conversion of well-known Microformats.

In general, the RDF Interfaces layer is fairly similar to existing, widely used RDF libraries like RDFLib and Jena. However, it is not the goal of the RDF Interfaces specification to replace existing RDF libraries; instead, the goal is to provide something similar for JavaScript (and, possibly, other Web programming languages that do not have a common interface yet). This goal of being optimized for JavaScript does influence the definition of the interfaces. For example, JavaScript is extremely flexible in the creation of closures and anonymous callback functions. This flexibility is embraced as an advantage and used in Web Applications. As a consequence, the RDF Interfaces specification has methods on the Graph interface that can receive anonymous functions. For example, the forEach method executes a piece of code on each triple in a Graph. It is modelled after the method of the same name defined for JavaScript’s Array object. There is also functionality that can automatically execute a developer-supplied function every time a new triple is added to a Graph. The example below demonstrates how the RDF Interfaces specification can be used to parse a Turtle document, merge in the triples from an RDF/XML document, and operate on each triple in the resulting graph:

// turtleparser and rdfxmlparser are parser objects supplied by an implementation
var graph = turtleparser.parse("http://www.example.org/turtle.ttl");
graph.addAll(rdfxmlparser.parse("http://www.example.org/turtle.rdf"));
graph.forEach(function(triple) {
    // Code to process each triple
});
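To make the callback style concrete, here is a minimal, self-contained sketch of the two behaviours described above: running a function over every triple, and running a function each time a triple is added. The `SketchGraph` class and its `onAdd` method are hypothetical stand-ins invented for this illustration; they are not the interfaces defined by the specification, and triples are reduced to plain objects.

```javascript
// Hypothetical in-memory stand-in for a graph; NOT the specification's Graph interface.
class SketchGraph {
    constructor() {
        this.triples = [];
        this.addListeners = [];
    }

    // Register a function to run each time a triple is added (assumed name).
    onAdd(listener) {
        this.addListeners.push(listener);
        return this;
    }

    // Store the triple, then hand it to every registered listener.
    add(triple) {
        this.triples.push(triple);
        this.addListeners.forEach(function (listener) {
            listener(triple);
        });
        return this;
    }

    // Call the callback once per stored triple, in insertion order.
    forEach(callback) {
        this.triples.forEach(function (triple) {
            callback(triple);
        });
    }
}

const FOAF = "http://xmlns.com/foaf/0.1/";
const IVAN = "http://www.ivan-herman.net/foaf#me";
const sketch = new SketchGraph();
const addedPredicates = [];

// The listener only sees triples added after it was registered.
sketch.onAdd(function (triple) {
    addedPredicates.push(triple.predicate);
});

sketch.add({ subject: IVAN, predicate: FOAF + "name", object: "Ivan Herman" });
sketch.add({ subject: IVAN, predicate: FOAF + "homepage", object: "http://www.ivan-herman.net/" });

// Collect every object value by visiting each triple in turn.
const objects = [];
sketch.forEach(function (triple) {
    objects.push(triple.object);
});
console.log(objects);
```

Both hooks take ordinary functions, which is what lets a Web Application developer reuse the closure-based habits already familiar from Array processing and event handlers.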

Data can also be removed from the graph. For example, to remove triples whose object is the literal “Ivan”:

// You can write iterative code like the following
var tripleArray = graph.toArray();
for(var i = tripleArray.length - 1; i >= 0; i--) {
    if(tripleArray[i].object.nominalValue === "Ivan") {
        graph.remove(tripleArray[i]);
    }
}

// You can also write something more JavaScript-like
graph.match(null, null, rdf.createLiteral("Ivan")).forEach(function(triple) {
    graph.remove(triple);
});

// Finally, there is a far more compact style for this specific operation,
// as offered by the Graph interface. Note also that the createLiteral("Ivan")
// can be replaced by a simple string
graph.removeMatches(null, null, "Ivan");
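The pattern-based calls above treat `null` as a wildcard for a subject, predicate, or object position. The following sketch shows that idea on a plain array of triples. The helper names `fits`, `matchTriples`, and `withoutMatches` are hypothetical, and objects are simplified to plain strings rather than the literal and node objects a real implementation would use.

```javascript
// Hypothetical helper: true when a triple fits a pattern; null matches anything.
function fits(triple, subject, predicate, object) {
    return (subject === null || triple.subject === subject) &&
           (predicate === null || triple.predicate === predicate) &&
           (object === null || triple.object === object);
}

// Return the triples that fit the pattern, leaving the input array untouched.
function matchTriples(triples, subject, predicate, object) {
    return triples.filter(function (triple) {
        return fits(triple, subject, predicate, object);
    });
}

// Return the triples that do NOT fit the pattern, i.e. what is left after removal.
function withoutMatches(triples, subject, predicate, object) {
    return triples.filter(function (triple) {
        return !fits(triple, subject, predicate, object);
    });
}

// Sample data; the blank node labels are made up for this illustration.
const people = [
    { subject: "_:a", predicate: "http://xmlns.com/foaf/0.1/givenName", object: "Ivan" },
    { subject: "_:b", predicate: "http://xmlns.com/foaf/0.1/givenName", object: "Manu" },
    { subject: "_:a", predicate: "http://xmlns.com/foaf/0.1/familyName", object: "Herman" }
];

const matched = matchTriples(people, null, null, "Ivan");
const remaining = withoutMatches(people, null, null, "Ivan");
console.log(matched.length, remaining.length);
```

This sketch returns new arrays instead of mutating a graph in place, which keeps the example easy to follow; the calls shown in the article modify the graph itself.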

2. The RDF API layer

The goal of the RDF API layer is to provide an easy way to create Web Applications that utilize RDF as mash-up data without requiring the developer to understand the details of RDF. The central concept in this layer is the Projection. A Projection bundles a set of triples together that share a common subject and makes it easy to get to the structured data using properties as keys. For example, using the Friend-of-a-Friend vocabulary, one could write the following:

// "ivan" is a Projection that has a specific URI as subject, which we retrieve
// by using a query interface defined in the RDF API.

// Retrieve all of the properties (aka: predicates) associated with Ivan.
props = ivan.getProperties();

// Get the foaf:name for the object, which returns the string "Ivan Herman":
name  = ivan.get("http://xmlns.com/foaf/0.1/name");

// Retrieve the foaf:homepage for the object, which will return
// the string "http://www.ivan-herman.net/"
homepage = ivan.get("http://xmlns.com/foaf/0.1/homepage");

Note how this interface hides the difference between a URI and a Literal. This is intentional: for Web Application developers this differentiation is sometimes hard to understand and to follow. While RDF application programmers may care about this distinction, the RDF API level of abstraction does not. If the difference between Literals, URIs and other native RDF constructs is important to your application, use the RDF Interfaces layer.
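As a rough illustration of the idea, a projection can be thought of as a view over the triples that share one subject, keyed by property. The sketch below is an assumption-laden toy, not the RDF API’s implementation: `makeProjection` is a hypothetical factory, triples are plain objects, the `kind` field is an invented way of marking literals and URIs, and `http://example.org/someone-else` is a made-up subject.

```javascript
// Hypothetical sketch of a projection; NOT the RDF API's actual implementation.
function makeProjection(triples, subject) {
    // Keep only the triples that describe the requested subject.
    const own = triples.filter(function (triple) {
        return triple.subject === subject;
    });
    return {
        // List each property used for this subject once.
        getProperties: function () {
            const seen = [];
            own.forEach(function (triple) {
                if (seen.indexOf(triple.predicate) === -1) {
                    seen.push(triple.predicate);
                }
            });
            return seen;
        },
        // Return the first value as a plain string, whether it was a URI or a literal.
        get: function (property) {
            const hit = own.find(function (triple) {
                return triple.predicate === property;
            });
            return hit ? hit.object.value : undefined;
        }
    };
}

const ME = "http://www.ivan-herman.net/foaf#me";
const FOAF_NS = "http://xmlns.com/foaf/0.1/";
const foafTriples = [
    { subject: ME, predicate: FOAF_NS + "name", object: { kind: "literal", value: "Ivan Herman" } },
    { subject: ME, predicate: FOAF_NS + "homepage", object: { kind: "uri", value: "http://www.ivan-herman.net/" } },
    { subject: "http://example.org/someone-else", predicate: FOAF_NS + "name", object: { kind: "literal", value: "Someone Else" } }
];

const me = makeProjection(foafTriples, ME);
console.log(me.get(FOAF_NS + "name"));
console.log(me.get(FOAF_NS + "homepage"));
```

The `get` function discards the `kind` marker on purpose, mirroring the way the article’s examples return a string for both the name and the homepage.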

The question of how we get to a Projection still remains. The RDF API layer has another abstraction called DocumentData, which holds a single Graph containing structured data in the document. Using this object, a Projection can be retrieved. For example:

// Get Ivan’s Friend-of-a-Friend data
ivanFoaf = data.parse(turtleparser, "http://www.ivan-herman.net/foaf.ttl");

// Ivan’s URI in the foaf file, the one that is labelled as a foaf:Person
ivan = ivanFoaf.getProjection("http://www.ivan-herman.net/foaf#me");

// An alternative is to retrieve the Projection by using the foaf:Person
// This code assumes that there is only one foaf:Person in the graph
ivan = ivanFoaf.getProjections("http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "http://xmlns.com/foaf/0.1/Person")[0];

While these examples use absolute URIs to access data, the RDF Interfaces, RDF API and RDFa API allow you to use Compact URI Expressions (or CURIEs) to simplify your code. The APIs allow the developer to manage CURIE prefixes and terms.
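The effect of a CURIE prefix mapping can be sketched in a few lines. The `expandCurie` function and the `prefixes` table below are hypothetical and only illustrate the expansion step; they do not reproduce how the APIs themselves register or resolve prefixes and terms.

```javascript
// Hypothetical prefix table mapping short prefixes to full namespace URIs.
const prefixes = {
    foaf: "http://xmlns.com/foaf/0.1/",
    rdf: "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
};

// Expand "prefix:reference" to a full URI; return the input unchanged
// when it has no colon or its prefix is not in the table.
function expandCurie(curie, table) {
    const colon = curie.indexOf(":");
    if (colon === -1) {
        return curie;
    }
    const prefix = curie.slice(0, colon);
    const reference = curie.slice(colon + 1);
    if (!Object.prototype.hasOwnProperty.call(table, prefix)) {
        return curie;
    }
    return table[prefix] + reference;
}

console.log(expandCurie("foaf:name", prefixes));
console.log(expandCurie("rdf:type", prefixes));
```

With a mapping like this in place, a call that previously needed `"http://xmlns.com/foaf/0.1/name"` could be written with the shorter `"foaf:name"`.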

3. The RDFa API layer

This layer is provided to give access to the RDFa information in a document and reuses most of the concepts from the RDF API, such as Projections and DocumentData. This layer also defines several extensions to the main Document interface of the DOM. Methods like getElementsBySubject are provided to query for DOM Nodes containing RDFa data.
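To give a feel for what a subject-based DOM query involves, the sketch below walks a small tree and collects the elements that name a given subject. It is deliberately simplified and rests on several assumptions: the tree is built from plain objects rather than real DOM nodes, `findBySubject` is a hypothetical helper, only an explicit `about` attribute is considered, and `http://example.org/other` is a made-up subject. Real RDFa processing has more ways of setting a subject and is not reproduced here.

```javascript
// Hypothetical, simplified lookup: only an explicit "about" attribute is considered.
function findBySubject(node, subject, found) {
    found = found || [];
    if (node.attributes && node.attributes.about === subject) {
        found.push(node);
    }
    // Visit the children depth-first, sharing the same result array.
    (node.children || []).forEach(function (child) {
        findBySubject(child, subject, found);
    });
    return found;
}

// Plain-object stand-in for a parsed document tree.
const page = {
    tagName: "body",
    attributes: {},
    children: [
        {
            tagName: "div",
            attributes: { about: "http://www.ivan-herman.net/foaf#me" },
            children: [
                { tagName: "span", attributes: { property: "foaf:name" }, children: [] }
            ]
        },
        { tagName: "div", attributes: { about: "http://example.org/other" }, children: [] }
    ]
};

const hits = findBySubject(page, "http://www.ivan-herman.net/foaf#me");
console.log(hits.map(function (node) { return node.tagName; }));
```

The point of the sketch is that such a query returns elements rather than triples, which is why it belongs in a DOM-aware layer instead of the RDF API.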

When it comes to specifications, the devil is in the details. The three drafts described in this article are starting to stabilize. There will be several more publications of drafts, gathering of feedback from various Web communities, addressing public comments, and implementation feedback. However, the RDFWA Working Group has provided a clear indication of the general direction and placed a stake in the ground. It is now up to the Web community to provide feedback and guide these specifications toward a common Semantic Web development environment that will be useful to all Web Application developers. If you would like to provide feedback on the specifications, please send comments to the official RDFWA Working Group mailing list.