Solution demonstrates 10x+ the performance while running on 100x the data
San Diego – August 20, 2014 – SPARQL City, which introduced its scalable graph analytic engine to market earlier this year, today announced that it has successfully run the SP2 SPARQL benchmark on 100 times the data volume used by other graph solution providers, while still delivering, on average, an order of magnitude better performance than their published results.
SPARQL City ran the SP2 Benchmark against 2.5 billion triples/edges on a sixteen-node cluster on Amazon EC2. Average query response time for the set of seventeen queries was about 6 seconds, with query 4, the most data-intensive query, which involves the entire dataset, taking approximately 34 seconds to run. By comparison, the best query 4 result reported by other graph solution providers has been around 15 seconds, but that result was obtained running against 25 million triples/edges, or 1/100th of the data volume in SPARQL City’s benchmark test. This level of performance, combined with the ability to easily scale out the solution on a cluster when required, makes easy-to-use interactive graph analytics on very large datasets possible for the first time. Detailed benchmark results can be found on our website.
“Big Data and the Internet are creating an expansion in analytic possibilities and leading to large investments in new data storage and processing capabilities. However, what’s missing are the right tools and foundational technologies that can remove the high barriers that end users still face when working with the new data sets of the post-web world,” said Barry Zane, Founder and CEO of SPARQL City. “Graph analytics has been proving itself in small pockets over the past few years as an effective solution to this problem, but most of the available solutions in this space do not scale well. With the building adoption of new W3C standards, we recognized that an RDF and SPARQL based solution to graph analytics was well matched to the analytic opportunities that were materializing around Big Data and the web, and our pedigree allowed us to build this solution in a way that addressed the needs of the enterprise.”
SPARQL City was founded by part of the same team that built leading relational MPP analytic companies such as Netezza (acquired by IBM in 2010) and ParAccel (acquired by Actian Corporation in 2013; its database underpins Amazon Redshift). These companies achieved success by delivering orders-of-magnitude increases in relational analytic performance while also dramatically improving end-user ease of use. SPARQL City has applied similar technology (massively parallel analytic processing) and rigor (sophisticated query optimization and code generation) to build a next-generation, standards-based graph analytics engine that delivers very high performance and ease of use while giving users the ability to scale out on commodity hardware as required. This, for the first time, makes large-scale graph analytics on data sets involving hundreds of billions of edges a practical possibility.
About SPARQL City:
SPARQL City is based in San Diego, CA, with offices in Marlborough, MA. SPARQL City is a provider of a Hadoop-based graph analytics engine that brings the performance and ease-of-use characteristics of massively parallel analytic databases to the world of graph data. The resulting product is built around modern open standards and delivers orders-of-magnitude improvements in performance and scale over existing solutions while running on commodity hardware. This opens up new possibilities in the world of graph processing and Big Data analytics. For more information, visit us at: www.sparqlcity.com