Emil Eifrem of Neo4j recently wrote, “Dr. Roy Marsten wrote in March that Graph Theory was a key approach in understanding and leveraging big data. As an advocate of graph theory and as a developer building graph databases since 2003, it was wonderful to read someone else with similar insights and appetites. As Dr. Marsten notes, Google started the graph analysis trend in the modern era using links between documents on the Web to understand their semantic context. As a result, Google produced a Web search engine that massively outperformed its established competitors and saw it jump so far ahead that ‘to Google’ became a verb. Of course we know very well Google’s history since then: its graph-centric approach has seen it deliver innovation at scale and dominate not only in its core search market, but also across the information management space.”
Posts Tagged ‘Neo4J’
At Decibel, provider of metadata-driven music discovery APIs, Neo4j has a featured role in a learning project that is the start of a plan to replace the relational database of 5 million tracks from 1.1 million albums by 300,000 artists – and the world of connections around that data – with a NoSQL graph database. With Decibel’s APIs, customers like the Blue Note Records jazz label, in partnership with developer Groovebug, have turned their record collections into a virtual record store, including track listings, individual track participations, recording session venues and dates.
With the APIs that tap into Decibel and fold into their own programs, developers at record labels, MP3 services and other digital music and entertainment venues can connect everything from the debut date of the bootleg Thin Lizzy album, Remembering Part 1, to the number of tracks on it, to Sade’s 2011 cover of Still in Love With You, to support music-lovers’ search and discovery experiences. Or they’ll be able to surface that a piece of classical music with a German title is the same as another piece referenced by its French name, or that a musician who has gone by three different names in his career is one and the same.
The debate about PRISM continues. One of the latest volleys was posted in InformationWeek by Coverlet Meshing (a pseudonym used by “a senior IT executive at one of the nation’s largest banks”). Meshing wrote: “Prism doesn’t scare me. On 9/11, my office was on the 39th floor of One World Trade. I was one of the many nameless people you saw on the news running from the towers as they collapsed. But the experience didn’t turn me into a hawk. In fact, I despise the talking heads who frame Prism as the price we pay for safety. And not just because they’re fear-mongering demagogues. I hate them because I’m a technologist and they’re giving technology a bad name.”
“We want to help the world make sense of data and we think graphs are the best way of doing that.”
That’s the word from Emil Eifrem, CEO of Neo Technology, which makes the open-source Neo4j NoSQL graph database. He’s not talking in terms of RDF-centric solutions, even though he says he’s 100 percent in agreement with the vision of the semantic web and machine readability. “The world is a graph,” Eifrem says, “and RDF is a great way of connecting things. I’m all in agreement there.” The problem, in his opinion, is that execution on the software end there has been lacking.
“This comes down to usability,” he says, and the average developer, he believes, finds the semantic web-oriented tools largely incomprehensible. Eifrem says he’s speaking from real-world experience, having worked directly with RDF and taught classes on the semantic web layers. Where it took a week to get students up to speed on things like Jena and Sesame, they ‘get’ the property graph and graph databases in half a day, he says. Neo4j stores data in nodes connected by directed, typed relationships, with properties on both nodes and relationships – a model known as a property graph.
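To make the property graph model concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not Neo4j's API or storage format: the class names (`Node`, `Relationship`, `PropertyGraph`), the `COVERED` relationship type and the property keys are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: int       # id of the node the relationship leaves
    end: int         # id of the node the relationship points to
    rel_type: str    # every relationship carries exactly one type
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical toy container: nodes and directed, typed relationships,
    each holding its own key/value properties."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)
        return self.nodes[node_id]

    def relate(self, start, end, rel_type, **properties):
        rel = Relationship(start, end, rel_type, properties)
        self.relationships.append(rel)
        return rel

    def outgoing(self, node_id, rel_type=None):
        # Follow direction: only relationships that start at node_id,
        # optionally filtered by relationship type.
        return [
            r for r in self.relationships
            if r.start == node_id and (rel_type is None or r.rel_type == rel_type)
        ]


# Example data borrowed from the Decibel excerpt above; the schema is invented.
graph = PropertyGraph()
graph.add_node(1, name="Sade", kind="artist")
graph.add_node(2, title="Still in Love With You", kind="track")
graph.relate(1, 2, "COVERED", year=2011)

for rel in graph.outgoing(1, "COVERED"):
    print(graph.nodes[rel.end].properties["title"], rel.properties["year"])
```

Note that the year lives on the relationship, not on either node. Attaching properties to the connections themselves is what sets the property graph apart from a plain node-and-edge graph.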
Triplestores are Database Management Systems (DBMS) for data modeled using RDF. Unlike Relational Database Management Systems (RDBMS), which store data in relations (or tables) and are queried using SQL, triplestores store RDF triples and are queried using SPARQL.
A key feature of many triplestores is the ability to do inference. Note that a DBMS typically offers the capacity to deal with concurrency, security, logging, recovery, and updates, in addition to loading and storing data. Not all triplestores offer all these capabilities (yet).
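As a rough illustration of the triple model and of inference, the Python sketch below stores triples as plain tuples, matches them against a pattern, and applies one RDFS-style rule. It rests on several assumptions: real triplestores use IRIs and SPARQL rather than bare strings and a Python function, the helper names `match` and `infer_types` are hypothetical, and the sample data is invented.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard,
    loosely like a variable in a SPARQL triple pattern."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def infer_types(triples):
    """Apply one rule until nothing new appears:
    (x type C) and (C subClassOf D)  =>  (x type D)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for x, _, c in match(inferred, p="type"):
            for _, _, d in match(inferred, s=c, p="subClassOf"):
                if (x, "type", d) not in inferred:
                    inferred.add((x, "type", d))
                    changed = True
    return inferred


# Invented sample data: subject, predicate, object.
triples = {
    ("Sade", "type", "Singer"),
    ("Singer", "subClassOf", "Musician"),
    ("Musician", "subClassOf", "Artist"),
}

# Without inference, nothing is stated to be an Artist.
print(match(triples, p="type", o="Artist"))

# After inference, the implied fact can be queried like any stored triple.
closed = infer_types(triples)
print(match(closed, p="type", o="Artist"))
```

The second query returns a triple that was never loaded, only implied by the class hierarchy. Inference in a triplestore works in this sense, though production systems support far richer rule sets than this single rule.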
Triplestores can be broadly classified into three categories: native triplestores, RDBMS-backed triplestores and NoSQL triplestores.
Emil Eifrem, founder of Neo4j, has written an article for Mashable about the rise of graph databases. He writes, “Until the NOSQL wave hit a few years ago, the least fun part of a project was dealing with its database. Now there are new technologies to keep the adventuresome developer busy. The catch is, most of these post-relational databases, such as MongoDB, Cassandra, and Riak, are designed to handle simple data. However, the most interesting applications deal with a complex, connected world. A new type of database significantly changes the standard direction taken by NOSQL. Graph databases, unlike their NOSQL and relational brethren, are designed for lightning-fast access to complex data found in social networks, recommendation engines and networked systems.”
Semantic Web Community: I’m disappointed in us! Or at least in our group marketing prowess. We have been failing to capitalize on two major trends that everyone has been talking about and that are directly addressable by Semantic Web technologies! For shame.
I’m talking, of course, about Big Data and NoSQL. Since I’ve already given my take on how Semantic Web technology can help with the Big Data problem on SemanticWeb.com, this time around I’ll tackle NoSQL and the Semantic Web.
After all, we gave up SQL more than a decade ago. We should be part of the discussion. Heck, even the XQuery guys got in on the action early!
Check out this Google Trends diagram.
NoSQL came out of nowhere in 2009, and now dominates much of the database conversation on the web. Document stores like MongoDB and CouchDB, distributed key-value stores such as Riak and Cassandra, and other weird stores like Hadoop-as-database (never understood that usage myself) are now the alternatives everyone talks about to traditional SQL databases.