Dominik Schweiger, Zlatko Trajanoski and Stephan Pabinger recently wrote, “Semantic Web has established itself as a framework for using and sharing data across applications and database boundaries. Here, we present a web-based platform for querying biological Semantic Web databases in a graphical way. Results: SPARQLGraph offers an intuitive drag &drop query builder, which converts the visual graph into a query and executes it on a public endpoint. The tool integrates several publicly available Semantic Web databases, including the databases of the just recently released EBI RDF platform. Furthermore, it provides several predefined template queries for answering biological questions. Users can easily create and save new query graphs, which can also be shared with other researchers.” Read more
Posts Tagged ‘SPARQL’
Solution demonstrates 10x+ the performance while running on 100x the data
San Diego – August 20, 2014 – SPARQL City, which introduced its scalable graph analytic engine to market earlier this year, today announced that it has successfully run the SP2 SPARQL benchmark on 100 times the data volume as other graph solution providers, while still delivering an order of magnitude better performance on average compared to published results.
SPARQL City ran the SP2 Benchmark against 2.5 billion triples/edges on a sixteen node cluster on Amazon EC2. Average query response time for the set of seventeen queries was about 6 seconds, with query 4, the most data intensive query involving the entire dataset taking approximately 34 seconds to run. By comparison, the best reported query 4 result by other graph solution providers has been around 15 seconds, but this is when running against 25 million triples/edges, or 1/100th of the data volume in SPARQL City’s benchmark test. This level of performance, combined with the ability to easily scale out the solution on a cluster when required, makes easy to use interactive graph analytics on very large datasets possible for the first time. Detailed benchmark results can be found on our website.
If you’re interested in Linked Data, no doubt you’re planning to listen in on next week’s Semantic Web Blog webinar, Getting Started With The Linked Data Platform (register here), featuring Arnaud Le Hors, Linked Data Standards Lead at IBM and chair of the W3C Linked Data Platform WG and the OASIS OSLC Core TC. It also may be on your agenda to attend this month’s Semantic Web Technology & Business Conference, where speakers including Le Hors, Manu Sporny, Sandro Hawke, and others will be presenting Linked Data-focused sessions.
In the meantime, though, you might enjoy reviewing the results of the LOD2 Project, the European Commission co-funded effort whose four-year run, begun in 2010, aimed at advancing RDF data management; extracting, creating and enriching structured RDF data; interlinking data from different sources; and authoring, exploring and visualizing Linked Data. To that end, why not take a stroll through the recently released Linked Open Data – Creating Knowledge Out of Interlinked Data, edited by LOD2 Project participants Soren Auer of the Institut für Informatik III Rheinische Friedrich-Wilhelms-Universität; Volha Bryl of the University of Mannheim, and Sebastian Tramp of the University of Leipzig?
Is SPARQL the SQL for NoSQL? The question will be discussed at this month’s Semantic Technology & Business Conference in San Jose by Arthur Keen, vp of solution architecture of startup SPARQL City.
It’s not the first time that the industry has considered common database query languages for NoSQL (see this story at our sister site Dataversity.net for some perspective on that). But as Keen sees it, SPARQL has the legs for the job. “What I know about SPARQL is that for every database [SQL and NoSQL alike] out there, someone has tried to put SPARQL on it,” he says, whereas other common query language efforts may be limited in database support. A factor in SPARQL’s favor is query portability across NoSQL systems. Additionally, “you can achieve much higher performance using declarative query languages like SPARQL because they specify the ‘What’ and not the ‘How’ of the query, allowing optimizers to choose the best way to implement the query,” he explains.
Standard Analytics, which was a participant at the recent TechStars event in New York City, has a big goal on its mind: To organize the world’s scientific information by building a complete scientific knowledge graph.
The company’s co-founders, Tiffany Bogich and Sebastien Ballesteros,came to the conclusion that someone had to take on the job as a result of their own experience as researchers. A problem they faced, says Bogich, was being able to access all the information behind published results, as well as search and discover across papers. “Our thesis is that if you can expose the moving parts – the data, code, media – and make science more discoverable, you can really advance and accelerate research,” she says.
[Editor's Note: Since 2007, Sindice.com has served as a specialized search engine allowing Semantic Web practitioners and researchers to locate structured data on the Web. At the peak of its activity, Sindice.com had an index of over 700M pages and processed 20M pages per day. In a post last week, the founding team announced the end of support for Sindice.com to concentrate on delivering the technology developed for the engine to enterprise users. This week, SemanticWeb.com is proud to host a guest post by the founding team explaining the history, the challanges and the future of this technology.]
The word “Sindice” has been around for quite some time in research and practice on the “Semantic Web” or “lets see how we can turn the web into a database”.
Since 2007, Sindice.com has served as a specialized search engine that would do a crazy thing: throw away the text and just concentrate on the “markup” of the web pages. Sindice would provide an advanced API to query RDF, RDFa, Microformats and Microdata found on web sites, together with a number of other services. Sindice turned useful, we guess, as approximately 1100 scientific works in the last few years refer to it in a way or another.
Last week, we the founding team announced the end of our support of the original Sindice.com semantic search engine to concentrate on the technology that came from it.
With the launch in 2012 of Schema.org, Google and others have effectively embraced the vision of the “Semantic Web.” With the RDFa standard, and now even more with JSON-LD, richer markup is becoming more and more popular on websites. While there might not be public web data “search APIs”, large collections of crawled data (pages and RDF) exist today which are made available on cloud computing platforms for easy analysis with your favorite big data paradigm.
Even more interestingly, the technology of Sindice.com has been made available in several projects maintained either as open source (see below) or commercially supported by the Sindice.com team now transitioned in the Sindice LTD company, AKA SindiceTech.
It has been quite a journey for us, and given there is no single summary anywhere we thought we’d take this occasion to write and share it.
This is both for “historical” reasons and as a way to glimpse at future directions of this field and these technologies.
The CHAIN-REDS FP7 project, co-funded by the European Commission, has as a goal building a knowledge base of information, gathered both from dedicated surveys and other web and document sources, for largely more than half of the countries in the world, which it presents to visitors through geographic maps and tables. Earlier this month, its Knowledge Base and Semantic Search Engine for exploring the more than 30 million documents in its Open Access Document Repositories (OADR) and Data Repositories (DR) became available in a smartphone and tablet app, while the results of its Semantic Search Engine also now are ranked according to the January 2014 Ranking Web of Repositories. So, users conducting searches should see results in the order of the highest-ranked repositories.
The project has its roots in using semantic web technologies to correlate the data used to write scientific papers with the documents themselves whenever possible, says Prof. Roberto Barbera, of the Department of Physics and Astronomy at the University of Catania, as well as with applications that can be used to analyse the information. To drive to these ends, the CHAIN-REDS consortium semantically enriched its repositories and built its search engine on the related Linked Data. Users in search of information can get papers and data and, if applications are available, can be redirected to them on the project’s cloud infrastructure to reproduce and reanalyze the data.
“There is a huge effort in the scientific world about the reproducibility of science,” says Barbera.
Today the Web celebrates its 25th birthday, and we celebrate the Semantic Web’s role in that milestone. And what a milestone it is: As of this month, the Indexed Web contains at least 2.31 billion pages, according to WorldWideWebSize.
The Semantic Web Blog reached out to the World Wide Web Consortium’s current and former semantic leads to get their perspective on the roads The Semantic Web has traveled and the value it has so far brought to the Web’s table: Phil Archer, W3C Data Activity Lead coordinating work on the Semantic Web and related technologies; Ivan Herman, who last year transitioned roles at the W3C from Semantic Activity Lead to Digital Publishing Activity Lead; and Eric Miller, co-founder and president of Zepheira and the leader of the Semantic Web Initiative at the W3C until 2007.
While The Semantic Web came to the attention of the wider public in 2001, with the publication in The Scientific American of The Semantic Web by Tim Berners-Lee, James Hendler and Ora Lassila, Archer points out that “one could argue that the Semantic Web is 25 years old,” too. He cites Berners-Lee’s March 1989 paper, Information Management: A Proposal, that includes a diagram that shows relationships that are immediately recognizable as triples. “That’s how Tim envisaged it from Day 1,” Archer says.
Almost exactly 10 years after the publication of RDF 1.0 (10 Feb 2004, http://www.w3.org/TR/rdf-concepts/), the World Wide Web Consortium (W3C) has announced today that RDF 1.1 has become a “Recommendation.” In fact, the RDF Working Group has published a set of eight Resource Description Framework (RDF) Recommendations and four Working Group Notes. One of those notes, the RDF 1.1 primer, is a good starting place for those new to the standard.
Lanthaler said of the recommendation, “Semantic Web technologies are often criticized for their complexity–mostly because RDF is being conflated with RDF/XML. Thus, with RDF 1.1 we put a strong focus on simplicity. The new specifications are much more accessible and there’s a clear separation between RDF, the data model, and its serialization formats. Furthermore, the primer provides a great introduction for newcomers. I’m convinced that, along with the standardization of Turtle (and previously JSON-LD), this will mark an important point in the history of the Semantic Web.”
NEXT PAGE >>