How Job Sites Could Improve with Semantic Web Technologies

Kurt Cagle, a Principal Evangelist for Semantic Technologies at Avalon Consulting, recently wrote, “I’m not a recruiter. I have from time to time submitted resumés for jobs to Monster or Linked-In to individual company sites as a developer or architect, but even there I’ve discovered what millions of job hunters already know: submitting online resumés is a pain. Consider the process. You create a profile, identifying yourself to job submission system X. This site may or may not have a way of uploading a text resumé, but one thing you find in the data management space is that structure matters, and the farther you deviate from the structure, the harder it is for some OCR Artificial Intelligence to actually make sense of what you’ve written.” Read more

The Apache Software Foundation Celebrates 15 Years of Open Source Innovation and Community Leadership

Budapest, Hungary, Nov. 19, 2014 (GLOBE NEWSWIRE) — At ApacheCon Europe, members of the Apache community commemorated The Apache Software Foundation (ASF)’s fifteenth anniversary and congratulated the people, projects, initiatives, and organizations that played a role in its success.

Recognized as the leader in community-led Open Source software development, the ASF was established to shepherd, develop, and incubate Open Source innovations “The Apache Way”. Reflections on achievements over the past 15 years include: Read more

Web Components: Even Better With Semantic Markup

The W3C’s Web Components model is positioned to solve many of the problems that beset web developers today. “Developers are longing for the ability to have reusable, declarative, expressive components,” says Brian Sletten, a specialist in semantic web and next-generation technologies, software architecture, API design, software development and security, and data science, and president of software consultancy Bosatsu Consulting, Inc.

Web Components should fulfill that longing: with the Templates, Custom Elements, Shadow DOM, and Imports specifications (all still drafts, and thus subject to change), developers get a way to build their web applications and elements as sets of reusable components. While most browsers don’t yet support these specifications, Web Component projects such as Polymer let developers who want to take advantage of these capabilities build Web objects and applications atop the specs today.

“With this kind of structure in place, now there is a market for people to create components that can be reused across any HTML-based application or document,” Sletten says. “There will be an explosion of people building reusable components so that you and I can use those elements and don’t have to write a ton of obnoxious JavaScript to do certain things.”
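As a minimal sketch of what such a reusable component looks like, the example below combines a `<template>`, a custom element registration, and a shadow root using the draft-era (v0) APIs the article refers to. The element and class names are illustrative, and at the time most browsers needed a polyfill such as Polymer’s platform.js for this to run:

```html
<!-- A minimal custom element using the draft Templates, Custom
     Elements, and Shadow DOM APIs (2014-era v0 syntax). Names are
     illustrative; a polyfill is assumed for most browsers. -->
<template id="user-card-template">
  <style> .name { font-weight: bold; } </style>
  <span class="name"></span>
</template>

<script>
  var proto = Object.create(HTMLElement.prototype);
  proto.createdCallback = function () {
    // Clone the template into a shadow root, keeping its markup
    // and styles encapsulated from the rest of the page.
    var template = document.querySelector('#user-card-template');
    var shadow = this.createShadowRoot();
    shadow.appendChild(template.content.cloneNode(true));
    shadow.querySelector('.name').textContent =
        this.getAttribute('name');
  };
  document.registerElement('user-card', { prototype: proto });
</script>

<user-card name="Ada Lovelace"></user-card>
```

Once registered, the element can be dropped into any page like a native tag, which is exactly the reuse Sletten describes.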

That in itself is exciting, Sletten says, but even more so is the connection he made that semantic markup can be added to any web component.

Read more

How to Query Walmart and BestBuy Data with SPARQL

Bob DuCharme recently wrote, “The combination of microdata and schema.org seems to have hit a sweet spot that has helped both to get a lot of traction. I’ve been learning more about microdata recently, but even before I did, I found that the W3C’s Microdata to RDF Distiller written by Ivan Herman would convert microdata stored in web pages into RDF triples, making it possible to query this data with SPARQL. With major retailers such as Walmart and BestBuy making such data available on—as far as I can tell—every single product’s web page, this makes some interesting queries possible to compare prices and other information from the two vendors.” Read more
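To make the workflow concrete, here is a sketch of the kind of SPARQL query that becomes possible once product-page microdata has been converted to RDF (for example, with the Microdata to RDF Distiller). The assumption that the extracted triples use schema.org’s `Product`/`Offer` layout is mine; real pages may nest their offers differently:

```sparql
# Sketch: query triples extracted from retailers' product pages.
# Assumes the microdata used schema.org Product/Offer properties.
PREFIX schema: <http://schema.org/>

SELECT ?product ?name ?price
WHERE {
  ?product a schema:Product ;
           schema:name ?name ;
           schema:offers/schema:price ?price .
}
ORDER BY ?price
```

Loading the triples extracted from a Walmart page and a BestBuy page for the same product into one dataset would then let a single query like this line up the two vendors’ prices.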

New Study Shows Electronic Health Records on the Rise

Bob Violino of Information Management reports, “Electronic health records (EHR) uptake in the U.S. has accelerated dramatically as a result of government initiatives and the considerable resources healthcare providers have invested over the past five years, says research firm Frost & Sullivan. Electronic health records have become the heart of health IT, the firm says, and U.S. clinicians use them on a daily basis. Frost & Sullivan’s latest health IT analysis, ‘EHR Usability—CIOs Weigh in On What’s Needed to Improve Information Retrieval,’ finds that as the market matures and the volume of EHR data proliferates, ensuring reliable information retrieval from EHRs at the point-of-care will become a priority for healthcare providers.” Read more

Retrieving and Using Taxonomy Data from DBpedia

DBpedia, as described in the recent article DBpedia 2014 Announced, is “a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web.” It currently has over 3 billion triples (that is, facts stored using the W3C standard RDF data model) available for use by applications, making it a cornerstone of the semantic web.

A surprising amount of this data is expressed using the SKOS vocabulary, the W3C standard model for taxonomies used by the Library of Congress, the New York Times, and many other organizations to publish their taxonomies and subject headers. (SKOS has been covered here many times in the past.) DBpedia has data about over a million SKOS concepts, arranged hierarchically and ready for you to pull down with simple queries so that you can use them in your RDF applications to add value to your own content and other data.

Where is this taxonomy data in DBpedia?

Many people think of DBpedia as mostly storing the fielded “infobox” information that you see in the gray boxes on the right side of Wikipedia pages—for example, the names of the founders and the net income figures that you see on the right side of the Wikipedia page for IBM. If you scroll to the bottom of that page, you’ll also see the categories that have been assigned to IBM in Wikipedia such as “Companies listed on the New York Stock Exchange” and “Computer hardware companies.” The Wikipedia page for Computer hardware companies lists companies that fall into this category, as well as two other interesting sets of information: subcategories (or, in taxonomist parlance, narrower categories) such as “Computer storage companies” and “Fabless semiconductor companies,” and then, at the bottom of the page, categories that are broader than “Computer hardware companies” such as “Computer companies” and “Electronics companies.”

How does DBpedia store this categorization information? The DBpedia page for IBM shows that DBpedia includes triples saying that IBM has Dublin Core subject values such as category:Companies_listed_on_the_New_York_Stock_Exchange and category:Computer_hardware_companies. The DBpedia page for the category Computer_hardware_companies shows that it is a SKOS concept with values for the two key properties of a SKOS concept: a preferred label and broader values. The category:Computer_hardware_companies concept is itself the broader value of several other concepts such as category:Fabless_semiconductor_companies. Because it’s the broader value of other concepts and has its own broader values, it can be both a parent node and a child node in a tree of taxonomic terms, so DBpedia has the data that lets you build a taxonomy hierarchy around any of its categories.
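The triples described above can be pulled down with a short query. The following sketch, which should run against DBpedia’s public SPARQL endpoint (http://dbpedia.org/sparql), retrieves the concepts immediately broader and narrower than the “Computer hardware companies” category; exact results depend on the current DBpedia release:

```sparql
# One level up and one level down from a DBpedia category,
# using the SKOS broader property in both directions.
PREFIX skos:     <http://www.w3.org/2004/02/skos/core#>
PREFIX category: <http://dbpedia.org/resource/Category:>

SELECT ?broader ?narrower
WHERE {
  { category:Computer_hardware_companies skos:broader ?broader . }
  UNION
  { ?narrower skos:broader category:Computer_hardware_companies . }
}
```

Repeating the same pattern from each retrieved concept (or using a SPARQL 1.1 property path such as `skos:broader+`) walks the full hierarchy in either direction.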

Read more

NEW WEBINAR Announced: Yosemite Project – Part 3

“Transformations for Integrating VA data with FHIR in RDF”

This webinar is the third installment in a recently launched series on the topic of “RDF as a Universal Healthcare Exchange Language.”

Part 1 of that series, “The Yosemite Project: An RDF Roadmap for Healthcare Information Interoperability,” is available as a recorded webinar and slide deck.

Part 2, “The Ideal Medium for Health Data? A Dive into Lab Tests,” will take place on November 7, 2014 (registration is open as of this writing).

Announcing Part 3:

click here to register now!
TITLE: Transformations for Integrating VA data with FHIR in RDF
DATE: Wednesday, November 12, 2014
TIME: 2 PM Eastern / 11 AM Pacific
PRICE: Free to all attendees
DESCRIPTION: In our series on The Yosemite Project, we explore RDF as a data standard for health data. In this installment, we will hear from Rafael Richards, Physician Informatician, Office of Informatics and Analytics in the Veterans Health Administration (VHA), about “Transformations for Integrating VA data with FHIR in RDF.”

The VistA EHR has its own data model and vocabularies for representing healthcare data. This webinar describes how SPARQL Inference Notation (SPIN) can be used to translate VistA data to the data representation used by FHIR, an emerging interchange standard.
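To give a flavor of the approach, SPIN rules are essentially SPARQL CONSTRUCT queries that rewrite one RDF shape into another. The sketch below maps a hypothetical VistA lab-result shape onto a hypothetical FHIR-in-RDF Observation shape; every namespace and property name here is illustrative, not the actual VHA or FHIR vocabulary:

```sparql
# Illustrative SPIN-style translation rule: a SPARQL CONSTRUCT
# that rewrites VistA-shaped triples as FHIR-shaped triples.
# All names below are placeholders, not real vocabularies.
PREFIX vista: <http://example.org/vista#>
PREFIX fhir:  <http://example.org/fhir#>

CONSTRUCT {
  ?result a fhir:Observation ;
          fhir:subject ?patient ;
          fhir:valueQuantity ?value .
}
WHERE {
  ?result a vista:LabResult ;
          vista:patient ?patient ;
          vista:numericValue ?value .
}
```

Because both the source and target are RDF, the mapping is itself data that can be stored, shared, and versioned alongside the vocabularies it connects.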


Read more

WEBINAR: The Yosemite Project – Part 1: An RDF Roadmap for Healthcare Information Interoperability (VIDEO)

In case you missed last Friday’s webinar, “The Yosemite Project – Part 1: An RDF Roadmap for Healthcare Information Interoperability,” delivered by David Booth, the recording and slides are now available (and posted below). The webinar runs for one hour, including a Q&A session with the audience that attended the live broadcast.

If you watch this webinar, please use the comments section below to share your questions, comments, and ideas for webinars you would like to see in the future.

About the Webinar

Interoperability of electronic healthcare information remains an enormous challenge in spite of 100+ available healthcare information standards. This webinar explains the Yosemite Project, whose mission is to achieve semantic interoperability of all structured healthcare information through RDF as a common semantic foundation. It explains the rationale and technical strategy of the Yosemite Project, and describes how RDF and related standards address a two-pronged strategy for semantic interoperability: facilitating collaborative standards convergence whenever possible, and crowd-sourced data translations when necessary.

Read more

Introducing GEMS, a Multilayer Software System for Graph Databases

The Pacific Northwest National Laboratory recently reported, “As computing tools and expertise used in conducting scientific research continue to expand, so have the enormity and diversity of the data being collected. Developed at Pacific Northwest National Laboratory, the Graph Engine for Multithreaded Systems, or GEMS, is a multilayer software system for semantic graph databases. In their work, scientists from PNNL and NVIDIA Research examined how GEMS answered queries on science metadata and compared its scaling performance against generated benchmark data sets. They showed that GEMS could answer queries over science metadata in seconds and scaled well to larger quantities of data.” Read more

GitHub Adds Actions to Email Notifications via JSON-LD

Stéphane Corlosquet has noticed that GitHub has added schema.org Actions using the JSON-LD syntax to the notification emails that GitHub users receive.
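The general pattern looks like the snippet below: a schema.org `ViewAction` expressed as JSON-LD and embedded in the email’s HTML, which mail clients that support the markup can surface as a one-click button. The URL and wording here are made up for illustration; GitHub’s actual markup may differ:

```html
<!-- Illustrative schema.org Action embedded in an email as JSON-LD.
     The repository URL and labels are placeholders. -->
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "EmailMessage",
  "description": "View this issue on GitHub",
  "potentialAction": {
    "@type": "ViewAction",
    "name": "View Issue",
    "url": "https://github.com/example/repo/issues/1"
  }
}
</script>
```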

On Twitter, Corlosquet posted:

Tweet from @scorlosquet: “Looks like @github just started to use schema.org actions with JSON-LD in their notifications emails!”

Read more