The Pacific Northwest National Laboratory recently reported on Phys.org, “As computing tools and expertise used in conducting scientific research continue to expand, so have the enormity and diversity of the data being collected. Developed at Pacific Northwest National Laboratory, the Graph Engine for Multithreaded Systems, or GEMS, is a multilayer software system for semantic graph databases. In their work, scientists from PNNL and NVIDIA Research examined how GEMS answered queries on science metadata and compared its scaling performance against generated benchmark data sets. They showed that GEMS could answer queries over science metadata in seconds and scaled well to larger quantities of data.”
Semantic Interoperability of Electronic Healthcare Info On The Agenda At U.S. Veterans Health Administration
The Yosemite Project, unveiled at this August’s Semantic Technology & Business Conference during the second annual RDF as a Universal Healthcare Exchange Language panel, lays out a roadmap for leveraging RDF in support of making all structured healthcare information semantically interoperable. (The Semantic Web Blog’s sister publication, Dataversity.net, has an article on its site explaining the details of that roadmap.)
The Yosemite Project grew out of the Yosemite Manifesto that was announced at the 2013 SemTechBiz conference (see our story here). David Booth, senior software architect at Hawaii Resource Group (who led the RDF Healthcare panels at both the 2013 and 2014 conferences), has now mapped the goals of the Manifesto into the Project’s guidelines for the journey to semantic interoperability. The approach taken by the Yosemite Project matches that of others in the healthcare sector who want to see semantic interoperability of electronic healthcare information.
Among them are Booth’s fellow panelists at this year’s event, including Rafael Richards. Richards, who is a physician informaticist at the U.S. Veterans Health Administration – which counts 1,200 care sites in its portfolio – comments on that alignment as it relates to the work he is leading in the Linked Vitals project to integrate the VA’s VistA electronic health records system with data types conforming to the Fast Healthcare Interoperability Resources, or FHIR, standard for data exchange, and with information types supporting the Logical Observation Identifiers Names and Codes, or LOINC, database that facilitates the exchange and pooling of results for clinical care, outcomes management, and research.
Aaron Bradley recently posted a roundtable discussion about JSON-LD which includes: “JSON-LD is everywhere. Okay, perhaps not everywhere, but JSON-LD loomed large at the 2014 Semantic Web Technology and Business Conference in San Jose, where it was on many speakers’ lips, and could be seen in the code examples of many presentations. I’ve read much about the format – and have even provided a thumbnail definition of JSON-LD in these pages – but I wanted to take advantage of the conference to learn more about JSON-LD, and to better understand why this very recently-developed standard has been such a runaway hit with developers. In this quest I could not have been more fortunate than to sit down with Gregg Kellogg, one of the editors of the W3C Recommendation for JSON-LD, to learn more about the format, its promise as a developmental tool, and – particularly important to me as a search marketer – the role in the evolution of schema.org.”
Dominik Schweiger, Zlatko Trajanoski and Stephan Pabinger recently wrote, “Semantic Web has established itself as a framework for using and sharing data across applications and database boundaries. Here, we present a web-based platform for querying biological Semantic Web databases in a graphical way. Results: SPARQLGraph offers an intuitive drag & drop query builder, which converts the visual graph into a query and executes it on a public endpoint. The tool integrates several publicly available Semantic Web databases, including the databases of the just recently released EBI RDF platform. Furthermore, it provides several predefined template queries for answering biological questions. Users can easily create and save new query graphs, which can also be shared with other researchers.”
Solution demonstrates 10x+ the performance while running on 100x the data
San Diego – August 20, 2014 – SPARQL City, which introduced its scalable graph analytic engine to market earlier this year, today announced that it has successfully run the SP2 SPARQL benchmark on 100 times the data volume used by other graph solution providers, while still delivering an order of magnitude better performance on average compared to published results.
SPARQL City ran the SP2 Benchmark against 2.5 billion triples/edges on a sixteen-node cluster on Amazon EC2. Average query response time for the set of seventeen queries was about 6 seconds, with query 4, the most data-intensive query, which involves the entire dataset, taking approximately 34 seconds to run. By comparison, the best reported query 4 result by other graph solution providers has been around 15 seconds, but this is when running against 25 million triples/edges, or 1/100th of the data volume in SPARQL City’s benchmark test. This level of performance, combined with the ability to easily scale out the solution on a cluster when required, makes easy-to-use, interactive graph analytics on very large datasets possible for the first time. Detailed benchmark results can be found on our website.
WASHINGTON, D.C. – SYSTAP, LLC. today announced that Syapse, the leading provider of software for enabling precision medicine, has selected Bigdata® as its backend semantic database. Syapse, which launched the Precision Medicine Data Platform in 2011, will use the Bigdata® database as a key element of their semantic platform. The Syapse Precision Medicine Data Platform integrates medical data, omics data, and biomedical knowledge for use in the clinic. Syapse software is delivered as a cloud-based SaaS, enabling access from anywhere with an internet connection, regular software updates and new features, and online collaboration and delivery of results, with minimal IT resources required. Syapse applications comply with HIPAA/HITECH, and data in the Syapse platform are protected according to industry standards.
Syapse’s Precision Medicine Data Platform features a semantic layer that provides powerful data modeling, query, and integration functionality. According to Syapse CTO and Co-Founder, Tony Loeser, Ph.D., “We have adopted SYSTAP’s graph database, Bigdata®, as our RDF store. Bigdata’s exceptional scalability, query performance, and high-availability architecture make it an enterprise-class foundation for our semantic technology stack.”
Is SPARQL the SQL for NoSQL? The question will be discussed at this month’s Semantic Technology & Business Conference in San Jose by Arthur Keen, vp of solution architecture of startup SPARQL City.
It’s not the first time that the industry has considered common database query languages for NoSQL (see this story at our sister site Dataversity.net for some perspective on that). But as Keen sees it, SPARQL has the legs for the job. “What I know about SPARQL is that for every database [SQL and NoSQL alike] out there, someone has tried to put SPARQL on it,” he says, whereas other common query language efforts may be limited in database support. A factor in SPARQL’s favor is query portability across NoSQL systems. Additionally, “you can achieve much higher performance using declarative query languages like SPARQL because they specify the ‘What’ and not the ‘How’ of the query, allowing optimizers to choose the best way to implement the query,” he explains.
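To make Keen’s “What, not How” point concrete, here is a minimal sketch in plain Python, not SPARQL itself and not anything from SPARQL City. The triple data, the `match` and `query` helpers, and the `?x` variable convention are all assumptions made up for illustration. The caller only states the patterns to satisfy; how they are evaluated is left to the engine, which is the freedom a real optimizer exploits.

```python
# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
]


def match(pattern, triples, bindings=None):
    """Yield variable bindings that satisfy one triple pattern.

    Terms starting with '?' are treated as variables (an assumed convention).
    """
    bindings = bindings or {}
    for triple in triples:
        candidate = dict(bindings)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is already bound differently.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield candidate


def query(patterns, triples):
    """Join several triple patterns, in the spirit of a basic graph pattern.

    This naive version evaluates patterns in the order given; an optimizer
    would be free to reorder them without changing the answer.
    """
    results = [{}]
    for pattern in patterns:
        results = [b for prior in results for b in match(pattern, triples, prior)]
    return results


# "Whom do the people Alice knows know?" stated as patterns, not as loops.
friends_of_friends = query(
    [("alice", "knows", "?x"), ("?x", "knows", "?y")], TRIPLES
)
print(friends_of_friends)
```

The query is just a list of patterns, so the same request could in principle be handed to a different back end unchanged, which is the portability argument in miniature.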
There’s a chance to learn everything you should know about RDF to get the most value from the W3C standard model for data interchange at the 10th annual Semantic Technology & Business Conference in San Jose next month. David Booth, senior software architect at Hawaii Resource Group, will be hosting a session explaining how the standard’s unique capabilities can have a profound effect on projects that seek to connect data coming in from multiple sources.
“One of the assumptions that people make looking at RDF is that it is analogous to any other data format, like JSON or XML,” says Booth, who is working at Hawaii Resource Group on a contract the firm has with the U.S. Department of Defense to use semantic web technologies to achieve healthcare data interoperability. “It isn’t.” RDF, he explains, isn’t just another data format – rather, it’s about the information content that is encoded in the format.
“The focus is different. It is on the meaning of data vs. the details of syntax,” he says.
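A rough way to picture Booth’s distinction: the sketch below, in plain Python, reads the same two statements from two different made-up syntaxes and shows that they reduce to the same set of triples. The `ex:` identifiers, the simplified line format (loosely modeled on N-Triples, and ignoring the difference between IRIs and literals), and the JSON layout are all assumptions for illustration, not real RDF serializations or anything from Booth’s session.

```python
import json

# Hypothetical example data: the same two statements in two different syntaxes.
LINE_FORM = """\
<ex:patient1> <ex:hasVital> <ex:bp1> .
<ex:bp1> <ex:systolic> "120" .
"""

JSON_FORM = json.dumps([
    {"s": "ex:bp1", "p": "ex:systolic", "o": "120"},
    {"s": "ex:patient1", "p": "ex:hasVital", "o": "ex:bp1"},
])


def parse_lines(text):
    """Parse the simplified one-statement-per-line syntax into a set of triples."""
    triples = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the trailing " ." terminator, then split into three terms.
        s, p, o = line.rstrip(" .").split(" ", 2)
        # Simplification: strip the <...> and "..." delimiters from each term.
        triples.add(tuple(term.strip('<>"') for term in (s, p, o)))
    return triples


def parse_json(text):
    """Parse the JSON syntax into the same set-of-triples representation."""
    return {(d["s"], d["p"], d["o"]) for d in json.loads(text)}


# Different syntax and different statement order, same information content.
same_content = parse_lines(LINE_FORM) == parse_json(JSON_FORM)
print(same_content)
```

Because the result is a set of statements, neither the surface syntax nor the order in which the statements were written affects the comparison.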
Straight out of Google I/O this week came some interesting announcements related to Semantic Web technologies and Linked Data. Included in the mix was a cool instructional video series about how to “Build a Small Knowledge Graph.” Part 1 was presented by Jarek Wilkiewicz, Knowledge Developer Advocate at Google (and SemTechBiz speaker).
Wilkiewicz fits a lot into the seven-and-a-half minute piece, in which he presents a (sadly) hypothetical example of an online music store that he creates with his Google colleague Shawn Simister. During the example, he demonstrates the power and ease of leveraging multiple technologies, including the schema.org vocabulary (particularly the recently announced ‘Actions’), the JSON-LD syntax for expressing the machine-readable data, and the newly launched Cayley, an open source graph database (more on this in the next post in this series).
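For a feel of what schema.org markup with an Action looks like in JSON-LD, here is a minimal sketch in Python. It is not taken from the video: the `album_jsonld` helper, the album and artist names, and the `store.example` URL are hypothetical, and the choice of `MusicAlbum`, `byArtist`, `potentialAction` and `ListenAction` is an assumption about which schema.org terms a music store might use.

```python
import json


def album_jsonld(name, artist, listen_url):
    """Build a JSON-LD description of an album with a potential listen action.

    All values passed in are illustrative; only the schema.org term names
    (MusicAlbum, MusicGroup, byArtist, potentialAction, ListenAction) are real.
    """
    return {
        "@context": "http://schema.org",
        "@type": "MusicAlbum",
        "name": name,
        "byArtist": {"@type": "MusicGroup", "name": artist},
        "potentialAction": {
            "@type": "ListenAction",
            "target": listen_url,
        },
    }


# Hypothetical store entry; the URL is a placeholder, not a real endpoint.
doc = album_jsonld(
    "Example Album", "Example Band", "http://store.example/listen/1"
)
print(json.dumps(doc, indent=2))
```

The output is ordinary JSON, which is much of the appeal for developers: it can be produced with a standard JSON library, while the `@context` and `@type` keys tell consumers how to interpret it as linked data.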