Posts Tagged ‘XQuery’

W3C’s Semantic Web Activity Folds Into New Data Activity

The World Wide Web Consortium has headline news today: the Semantic Web and eGovernment Activities are being merged into, and superseded by, the Data Activity, where Phil Archer serves as Lead. Two new working groups have also been chartered: CSV on the Web and Data on the Web Best Practices.

What’s driving this? First, Archer explains, the Semantic Web technology stack is now mature, and it’s time to allow those updated standards to be used. With RDF 1.1, the Linked Data Platform, SPARQL 1.1, RDB To RDF Mapping Language (R2RML), OWL 2, and Provenance all done or very close to it, it’s the right time “to take that very successful technology stack and try to implement it in the wider environment,” Archer says, rather than continue tinkering with the standards.

The second reason, he notes, is that a large community exists “that sees Linked Data, let alone the full Semantic Web, as an unnecessarily complicated technology. To many developers, data means JSON — anything else is a problem. During the Open Data on the Web workshop held in London in April, Open Knowledge Foundation co-founder and director Rufus Pollock said that if he suggested to the developers that they learn SPARQL he’d be laughed at – and he’s not alone,” Archer says. “We need to end the religious wars, where they exist, and try to make it easier to work with data in the format that people like to work in.”

The new CSV on the Web Working Group is an important step in that direction, following on the heels of efforts such as R2RML. Its goal is to provide metadata about CSV files, such as column headings, data types, and annotations, and thereby make it easy to convert CSV into RDF (or other formats), easing data integration. “The working group will define a metadata vocabulary and then a protocol for how to link data to metadata (presumably using HTTP Link headers) or embed the metadata directly. Since the links between data and metadata can work in either direction, the data can come from an API that returns tabular data just as easily as it can a static file,” says Archer. “It doesn’t take much imagination to string together a tool chain that allows you to run SPARQL queries against ‘5 Star Data’ that’s actually published as a CSV exported from a spreadsheet.”
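To make the idea of CSV plus metadata concrete, here is a minimal Python sketch of the kind of conversion such metadata could enable. Everything in it is an assumption made for illustration: the `METADATA` layout, the `example.org` URIs, and the sample rows are invented, and none of it reflects the vocabulary the Working Group will actually define.

```python
import csv
import io

# Hypothetical metadata for one CSV file: a base URI for row subjects, the
# column that identifies each row, and a property URI plus XSD datatype per
# column. This layout is invented; it is NOT the Working Group's vocabulary.
METADATA = {
    "row_base": "http://example.org/city/",
    "key_column": "id",
    "columns": {
        "name": {
            "property": "http://example.org/vocab#name",
            "datatype": "http://www.w3.org/2001/XMLSchema#string",
        },
        "population": {
            "property": "http://example.org/vocab#population",
            "datatype": "http://www.w3.org/2001/XMLSchema#integer",
        },
    },
}

# Made-up sample data; the second row has no population value.
SAMPLE_CSV = "id,name,population\n1,Alpha,100\n2,Beta,\n"


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text, metadata):
    """Turn each non-empty cell into one typed N-Triples statement."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumes the key column holds values that are safe inside a URI.
        subject = "<{}{}>".format(metadata["row_base"], row[metadata["key_column"]])
        for column, info in metadata["columns"].items():
            value = row.get(column)
            if not value:
                continue  # skip missing or empty cells
            triples.append(
                '{} <{}> "{}"^^<{}> .'.format(
                    subject, info["property"], escape_literal(value), info["datatype"]
                )
            )
    return triples


if __name__ == "__main__":
    for triple in csv_to_ntriples(SAMPLE_CSV, METADATA):
        print(triple)
```

The resulting N-Triples lines could then be loaded into any triple store and queried with SPARQL, which is the sort of tool chain Archer describes.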

Read more

Native XML Databases and RDF

I observed three trends at SemTech 2011 in San Francisco last week. First was the increased role of native XML databases used in combination with RDF data stores. Second was the many natural-language processing tools and vendors at the conference. And third was the role of semantic annotations and standards directly in web content. I think these trends are related.

One of the keynote presentations at the SemTech 2011 conference was given by the BBC. They presented their core architecture for managing web content as having two main components: a native XML database (MarkLogic) for content and an RDF triple store for “metadata.” These two tools were at the core of the architecture for their web sites.

Another presentation was given by the Mayo Clinic. They also use MarkLogic for web content alongside Semantic Web technologies. Their diagrams show that there are many ways for these systems to interact.

Read more

Building Competency in Semantic Web Technology

In part I of this two-part series, Dean Allemang & Scott Henninger draw on years of teaching TopQuadrant’s introduction course on the Semantic Web to make some observations on teaching Semantic Web concepts to a wide variety of students.

Read more

XSPARQL published as a W3C Submission

The “XSPARQL” specification has been published as a W3C member submission, co-authored by experts of Asemantics S.R.L., DERI Galway, Fundación CTIC, INRIA, Ontotext, OpenLink Software Inc., Profium, Talis Information Ltd., and the University of Innsbruck. This specification defines a merge of SPARQL and XQuery, and has the potential to bring XML and RDF closer together. XSPARQL provides concise and intuitive solutions for mapping between XML and RDF in either direction, addressing both the use cases of GRDDL and SAWSDL.

The Semantics of Meaningful XML Keyword Search Using SQL

Executive Summary

XML keyword search is still a popular academic subject, but it has not yet reached, or been recognized by, commercial XML and Internet products. The concepts involved are also very important to the Semantic Web. The semantics industry today, with its work on higher-level semantics such as ontologies and taxonomies, has overlooked the importance of utilizing the semantics of hierarchically structured data such as XML. When working with hierarchically structured data, the first level of semantic understanding must be recognizing the hierarchical structure and its (lower-level) hierarchical semantics. This understanding is then used to eliminate false keyword search results that can show up as matches in hierarchical structures. Otherwise, those false matches pass undetected to the higher-level semantic processing, which will not catch them either because it is not concerned with the structure of the data, and meaningless results will be returned.

Read more

Entity Extraction and the Semantic Web

Entity extraction is the process of automatically extracting document metadata from unstructured text documents. Extracting key entities such as person names, locations, dates, specialized terms and product terminology from free-form text can empower organizations not only to improve keyword search but also to open the door to semantic search, faceted search and document repurposing. This article defines the field of entity extraction, shows some of the technical challenges involved, and shows how RDF can be used to store document annotations. It then shows how new tools such as Apache UIMA are poised to make entity extraction much more cost-effective for an organization.
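As a rough illustration of storing extracted entities as RDF annotations, the Python sketch below finds dates with a regular expression, finds locations with a tiny lookup list, and writes each hit as N-Triples. It is a toy under stated assumptions, not the article's method and not Apache UIMA: the `GAZETTEER` entries, the `example.org` vocabulary, and the sample sentence are all invented for the example.

```python
import re

# Hypothetical annotation vocabulary and lookup list, invented for this sketch.
VOCAB = "http://example.org/annot#"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
GAZETTEER = {"Galway": "Location", "Innsbruck": "Location"}

# Matches ISO-style dates such as 2011-06-05.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_entities(text):
    """Return (start, end, type, surface text) tuples sorted by position."""
    entities = []
    for match in DATE_PATTERN.finditer(text):
        entities.append((match.start(), match.end(), "Date", match.group()))
    for term, entity_type in GAZETTEER.items():
        for match in re.finditer(re.escape(term), text):
            entities.append((match.start(), match.end(), entity_type, term))
    return sorted(entities)


def escape_literal(value):
    """Escape backslashes and quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def annotations_to_ntriples(doc_uri, entities):
    """Describe each entity as an annotation node attached to the document."""
    lines = []
    for index, (start, end, entity_type, surface) in enumerate(entities):
        node = "<{}#annotation{}>".format(doc_uri, index)
        lines.append("<{}> <{}hasAnnotation> {} .".format(doc_uri, VOCAB, node))
        lines.append("{} <{}type> <{}{}> .".format(node, VOCAB, VOCAB, entity_type))
        lines.append('{} <{}text> "{}" .'.format(node, VOCAB, escape_literal(surface)))
        lines.append('{} <{}start> "{}"^^<{}> .'.format(node, VOCAB, start, XSD_INTEGER))
    return lines


if __name__ == "__main__":
    sample = "Filed in Galway on 2011-06-05."  # made-up sentence
    found = extract_entities(sample)
    for line in annotations_to_ntriples("http://example.org/doc/1", found):
        print(line)
```

Keeping the character offset alongside the entity type means the annotation can be traced back to its place in the source text, which is what makes uses such as faceted search and document repurposing possible.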

Read more