Posts Tagged ‘MongoDB’

SPARQL And NoSQL: A Match On Many Levels

Is SPARQL the SQL for NoSQL? The question will be discussed at this month's Semantic Technology & Business Conference in San Jose by Arthur Keen, VP of solution architecture at startup SPARQL City.

It’s not the first time that the industry has considered common database query languages for NoSQL (see this story at our sister site Dataversity.net for some perspective on that). But as Keen sees it, SPARQL has the legs for the job. “What I know about SPARQL is that for every database [SQL and NoSQL alike] out there, someone has tried to put SPARQL on it,” he says, whereas other common query language efforts may be limited in database support. A factor in SPARQL’s favor is query portability across NoSQL systems. Additionally, “you can achieve much higher performance using declarative query languages like SPARQL because they specify the ‘What’ and not the ‘How’ of the query, allowing optimizers to choose the best way to implement the query,” he explains.
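To make the "what, not how" point concrete: the query below would run unchanged against any endpoint that speaks the SPARQL protocol, whatever store sits behind it. This is a minimal sketch using Python's SPARQLWrapper library against the public DBpedia endpoint; the endpoint and query are our own illustration, not SPARQL City's.

```python
# A minimal sketch of SPARQL's portability: the same declarative query
# works against any SPARQL-protocol endpoint, regardless of the store
# behind it. The endpoint and query here are illustrative examples.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")  # any endpoint works
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label WHERE {
        <http://dbpedia.org/resource/SPARQL> rdfs:label ?label .
        FILTER (lang(?label) = "en")
    }
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["label"]["value"])
```

The query says only what to match; whether the engine answers it from a triple store, a document store, or something else entirely is left to the optimizer.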

Read more

Semantically Aligned Design Principles At Core of Australian Electronic Health Records Platform

At the upcoming Semantic Technology & Business Conference in San Jose, Dr. Terry Roach, principal of CAPSICUM Business Architects, and Dr. Dean Allemang, principal consultant at Working Ontologist, will host a session on A Semantic Model for an Electronic Health Record (EHR). It will focus on Australia's electronic-Health-as-a-Service (eHaaS) national platform for personal electronic health records, provided by the CAPSICUM semantic framework for strategically aligned business architectures.

Roach and Allemang participated in an email interview with The Semantic Web Blog to preview the topic:

The Semantic Web Blog: Can you put the work you are doing on the semantic EHR model in context: How does what Australia is doing with its semantic framework compare with how other countries are approaching EHRs and healthcare information exchange?

Roach and Allemang: The eHaaS project that we have been working on is an initiative of Telstra, a large, traditional telecommunications provider in Australia. Its Telstra Health division, which is focused on health-related software investments, has spent the past two years making a set of strategic investments in the electronic health space. Since early 2013 it has acquired and/or established strategic partnerships with a number of local and international healthcare software providers, ranging from hospital information systems [to] mobile health applications [to] remote patient monitoring systems [to] personal health records [to] integration platforms and health analytics suites.

At the core of these investments is a strategy to develop a platform that captures and maintains diverse health-related interactions in a consolidated lifetime health record for individuals. The eHaaS platform facilitates interoperability and integration of several health service components over a common secure authentication service, data model, infrastructure, and platform. Starting from a base of stand-alone, vertical applications that manage fragmented information across the health spectrum, the eHaaS platform will establish an integrated, continuously improving, shared healthcare data platform that will aggregate information from a number of vertical applications, as well as an external gateway for standards-based eHealth messages, to present a unified picture of an individual’s health care profile and history.

Read more

SIREn Schemaless Structured Doc Search System Zips Through Complex Nested Document Search

Schemaless structured document search system SIREn (Semantic Information Retrieval ENgine) has posted some impressive benchmarks for a demonstration it did of its prowess in searching complex nested documents. A blog here discusses the test, which indexed a collection of about 44,000 U.S. patent grant documents, with an average of 1,822 nested objects per doc, comparing Lucene’s Blockjoin capability to SIREn.

The finding for the test dataset: “Blockjoin required 3,077MB to create facets over the three chosen fields and had a query time of 90.96ms. SIREn on the other hand required just 126 MB with a query time of 8.36ms. Blockjoin required 2442% more memory while being 10.88 times slower!”

SIREn, which was launched into its own website and community as part of SindiceTech’s relaunch (see our story here), attributes the results to its use of a fundamentally different conceptual model from the Blockjoin approach. In-depth tech details of the test are discussed here. It is also explained there that while the document focuses on Lucene/Solr, the results apply equally to ElasticSearch, which uses Lucene’s Blockjoin under the hood to support nested documents.
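For readers who want a feel for the kind of query being benchmarked, here is a rough sketch of a nested-document search in Elasticsearch, whose "nested" field type is the Blockjoin-backed mechanism the test compares against. The index name, mapping, and document shape below are hypothetical, not the benchmark's actual setup.

```python
# A hedged sketch of a Blockjoin-style nested-document query. Elasticsearch
# maps "nested" fields onto Lucene's Blockjoin under the hood; the index,
# fields, and documents here are invented for illustration.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Declare the nested field so child objects are indexed as separate
# Lucene documents joined back to their parent at query time.
es.indices.create(index="patents", mappings={
    "properties": {"claims": {"type": "nested"}}
})

es.index(index="patents", document={
    "patent_id": "US-1234567",
    "claims": [
        {"number": 1, "text": "A method for indexing nested documents..."},
        {"number": 2, "text": "The method of claim 1, wherein..."},
    ],
})

# Search inside the nested objects; matching children are joined to parents.
resp = es.search(index="patents", query={
    "nested": {
        "path": "claims",
        "query": {"match": {"claims.text": "indexing"}},
    },
})
print(resp["hits"]["total"])
```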

The Semantic Web Blog also checked in with SindiceTech CEO Giovanni Tummarello to get a further read on how SIREn has evolved since the relaunch to enable such results, and in other respects.

Read more

Gartner Uncovers Who’s Cool In The Supply Chain

Photo courtesy: Flickr/a loves dc

Gartner recently released its report, “Cool Vendors in Supply Chain Services,” which gives kudos to providers that use cloud computing as an enabler or delivery mechanism for capabilities that help enterprises better manage their supply chains.

On that list of vendors building cloud solutions and leveraging big data and analytics to optimize the supply chain is startup Elementum, which The Semantic Web Blog initially covered here and which envisions the supply chain as a complex graph of connections. As we reported previously, Elementum’s back end is built on real-time Java, with a MongoDB NoSQL document database and a flexible-schema graph database to store and map the nodes and edges of a supply chain graph. A URI is used for identifying data resources and metadata, and a federated platform query language makes it possible to access multiple types of data using that URI, regardless of what type of database it is stored in. Mobile apps let end users manage transportation networks, respond to supply chain risks, and monitor the health of the supply chain.
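The report doesn't spell out how that federated, URI-keyed access works, but the basic idea can be sketched in a few lines: one identifier, resolved against whichever stores hold pieces of the resource. Everything below — names, schema, the in-memory stand-in for the graph store — is hypothetical, not Elementum's implementation.

```python
# A rough, hypothetical sketch of URI-based federation: one identifier is
# dispatched to a document store for the record itself and to a graph
# store for its connections. None of this reflects Elementum's actual code.
from pymongo import MongoClient

mongo = MongoClient("mongodb://localhost:27017")
documents = mongo.supply_chain.resources   # document store for rich records

graph_edges = {}  # in-memory stand-in for a graph database of links

def fetch(uri: str) -> dict:
    """Resolve a resource URI against both stores and merge the results."""
    record = documents.find_one({"uri": uri}, {"_id": 0}) or {"uri": uri}
    record["links"] = graph_edges.get(uri, [])
    return record

documents.insert_one({"uri": "urn:site:shenzhen-1", "type": "factory"})
graph_edges["urn:site:shenzhen-1"] = ["urn:part:x100", "urn:hub:hk-air"]
print(fetch("urn:site:shenzhen-1"))
```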

Gartner analyst Michael Dominy writes in the report that Elementum earns its cool designation in part for its exploitation of Gartner’s Nexus of Forces, which the research firm describes as the convergence and mutual reinforcement of social, mobility, cloud and information patterns that drive new business scenarios.

Read more

SindiceTech Relaunch Features SIREn Search System, PivotBrowser Relational Faceted Browser

Last week news came from SindiceTech about the availability of its SindiceTech Freebase Distribution for the cloud (see our story here). SindiceTech has finalized its separation from the university setting in which it incubated, the former DERI institute (now part of the Insight Center for Data Analytics), and is relaunching its activities, with more new solutions and capabilities on the way.

“The first thing was to launch the Knowledge Graph distribution in the cloud,” says CEO Giovanni Tummarello. “The Freebase distribution showcases how it is possible to quickly have a really large Knowledge Graph in one’s own private cloud space.” The distribution comes instrumented with some of the tools SindiceTech has developed to help users both understand and make use of the data, he says, noting that “the idea of the Knowledge Graph is to have a data integration space that makes it very simple to add new information, but all that power is at risk of being lost without the tools to understand what is in the Knowledge Graph.”

Included in the first round of the distribution’s tools for composing queries and understanding the data as a whole are the Data Types Explorer (in both tabular and graph versions), and the Assisted SPARQL Query Editor. The next releases will increase the number of tools and provide updated data. “Among the tools expected is an advanced Knowledge Graph entity search system based on our newly released SIREn search system,” he says.

Read more

The Supply Chain Is One Big Graph In Start-up Elementum’s Platform

Startup Elementum wants to take supply chains into the 21st century. Incubated at Flextronics, the second largest contract manufacturer in the world, and launching today with $44 million in Series B funding from that company and Lightspeed Ventures, its approach is to get supply chain participants – the OEMs that generate product ideas and designs, the contract manufacturers who build to those specs, the component makers who supply the ingredients to make the product, the various logistics hubs that move finished product to market, and the retail customer – to drop the one-off relational database integrations and instead see the supply chain fundamentally as a complex graph or web of connections.

“It’s no different thematically from how Facebook thinks of its social network or how LinkedIn thinks of what it calls the economic graph,” says Tyler Ziemann, head of growth at Elementum. Built on Amazon Web Services, Elementum’s “mobile-first” apps for real-time visibility, shipment tracking and carrier management, risk monitoring and mitigation, and order collaboration have a back end built to consume and make sense of both structured and unstructured data on the fly. That back end pairs real-time Java and a MongoDB NoSQL document database – to scale simply and less expensively across a global supply chain that fundamentally involves many trillions of records – with a flexible-schema graph database to store and map the nodes and edges of the supply chain graph.
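To see why a graph model fits here, consider a toy version of a supply chain graph, sketched below with the networkx library; the node names and relations are invented, not Elementum data.

```python
# A toy sketch of the "supply chain as a graph" idea: suppliers, factories,
# logistics hubs, and retailers as nodes, their relationships as edges.
# All names are invented examples.
import networkx as nx

G = nx.DiGraph()
G.add_edge("acme-components", "flex-factory-7", relation="supplies")
G.add_edge("flex-factory-7", "hk-air-hub", relation="ships_via")
G.add_edge("hk-air-hub", "retailer-eu", relation="delivers_to")

# A disruption at one node propagates along the graph's edges, which is
# awkward to express as one-off relational joins but trivial here:
at_risk = nx.descendants(G, "flex-factory-7")
print(f"Nodes at risk if flex-factory-7 halts: {at_risk}")
```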

“Relational database systems can’t scale to support the types of data volumes we need and the flexibility that is required for modeling the supply chain as a graph,” Ziemann says.

Read more

Bottlenose Nerve Center Debuts, Bringing The Artificial Analyst To The Enterprise

The enterprise version of Bottlenose has formally launched. Now dubbed Nerve Center, the service, which The Semantic Web Blog previewed here, provides real-time trend intelligence for brands and businesses. It includes a dashboard featuring live visualization of all trending topics, hashtags, and people; top positive and negative influencers and sentiment trends; trending images, videos, links, and popular messages; the ability to view trending messages by type (complaints vs. endorsements, for example); and real-time KPIs. As with its original service, Nerve Center leverages the company’s Sonar technology to automatically detect new topics and trends that matter to the enterprise.

“Broadly speaking, every large enterprise has to be doing social listening and social analytics,” CEO Nova Spivack told The Semantic Web Blog in an earlier interview, “including in realtime, which is one thing we specialize in. I don’t think any other product out there shows change as it happens as we do.” It’s important, he said, to understand that Bottlenose focuses on the discovery of trends, not just finding what users explicitly search for or track. Part of the release, he added, “will be some pretty powerful alerting to tell you when there is something to look at.”

Read more

Bottlenose Enterprise Wants To Be Your Artificial Analyst Team To Discover Trends And Insights

Bottlenose earlier this month raised $3.6 million in Series A funding to help with its launch of Bottlenose Enterprise, the upcoming tool aimed at helping large companies discover and visualize trends from among a host of data sources, measuring and comparing them for those with the most “trendfluence.” Users will get a realtime dynamic view of change as it happens and a host of analytics for automating insights, the company says.

The Enterprise edition will be a big departure from the current Bottlenose Lite version for individual professionals. That difference starts with the amount of data it can handle. “The free, Lite version looks only at public API data like Twitter’s. The enterprise version uses the firehose,” says CEO Nova Spivack. Another big difference is that the enterprise version adds a lot more views and analytics, in comparison to the personal-use edition, where its Sonar technology provides the chief service of real-time detection of talk around topics personalized to users’ interests so they can visualize and track those topics over time.

Spivack calls what Enterprise does “enterprise-scale trend detection in the cloud,” leveraging a massive Hadoop infrastructure and technologies including Cassandra, MongoDB, and the Storm distributed realtime computation system to process data for deep dives. The cloud handles the computation, and results are shared at the edge, where certain kinds of analytics and visualizations occur locally in the browser for a realtime experience with no latency. Drawing on sources such as social streams, stock information, even a company’s proprietary data, the Enterprise version helps brands discover important trends, like keywords to bid on or viral content to share; who their influencers and detractors are; what sentiment and demographic movements are taking shape; and correlations across data points.
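Bottlenose hasn't published how Sonar works, but the underlying idea of real-time trend detection can be illustrated with a simple sliding-window spike detector. This is a conceptual sketch only, with made-up thresholds; it is not Bottlenose's algorithm.

```python
# A conceptual sketch of real-time trend detection: count topic mentions
# in a sliding window and flag topics whose current rate jumps well above
# their recent baseline. Thresholds are arbitrary; this is not Sonar.
import time
from collections import defaultdict, deque

WINDOW = 60.0        # seconds of history kept per topic
SPIKE_RATIO = 3.0    # "trending" = current rate at least 3x the baseline

mentions = defaultdict(deque)  # topic -> timestamps of recent mentions

def observe(topic, now=None):
    """Record one mention; return True if the topic is spiking."""
    now = time.time() if now is None else now
    q = mentions[topic]
    q.append(now)
    while q and q[0] < now - WINDOW:   # drop mentions outside the window
        q.popleft()
    recent = sum(1 for t in q if t > now - WINDOW / 6)  # last 10 seconds
    baseline = (len(q) - recent) / 5 or 1               # per-10s average
    return recent / baseline >= SPIKE_RATIO

for i in range(50):                    # simulate a burst of mentions
    if observe("#outage", now=1000.0 + i * 0.1):
        print("trending: #outage")
        break
```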

Read more

Linked Data: Moving Towards Consumption

Earlier this month, 16 of 42 submitted papers were accepted for the upcoming Linked Data on the Web (LDOW) 2012 Workshop, taking place in Lyon, France in April.

What might be discerned from the tenor of the submissions is something of a shift in focus in the Linked Data space, according to workshop chair Dr. Michael Hausenblas, Linked Data Research Centre, DERI, NUI Galway, Ireland. Other organizing committee members include Tim Berners-Lee, Christian Bizer and Tom Heath. “In 2008 to 2010 it was more like we were establishing the field, getting people to talk about what they do in terms of publishing and best practice around Linked Data, Open Linked Data and Linked Enterprise Data,” says Hausenblas. Now, with the web of Linked Data having grown to about 32 billion RDF triples last year, “we’re moving more towards the consumption – publishing is a necessary precondition but not an end in itself.”
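A small example of what "consumption" means in practice: dereference a Linked Data URI and query the triples that come back, rather than just publishing your own. The sketch below uses Python's rdflib against a public DBpedia resource, chosen purely for illustration.

```python
# A minimal Linked Data consumption example: fetch the RDF behind a URI
# and ask it a question. The resource is an illustrative public example.
import rdflib

g = rdflib.Graph()
g.parse("http://dbpedia.org/resource/Lyon")  # content-negotiates to RDF

qres = g.query("""
    SELECT ?p ?o WHERE { <http://dbpedia.org/resource/Lyon> ?p ?o . }
    LIMIT 5
""")
for row in qres:
    print(row.p, row.o)
```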

Read more

Breaking into the NoSQL Conversation

Rob Gonzalez, Cambridge Semantics

Semantic Web Community: I’m disappointed in us! Or at least in our group marketing prowess. We have been failing to capitalize on two major trends that everyone has been talking about and that are directly addressable by Semantic Web technologies! For shame.

I’m talking of course about Big Data and NoSQL. Since I’ve already given my take on how Semantic Web technology can help with the Big Data problem on SemanticWeb.com, this time around I’ll tackle NoSQL and the Semantic Web.

After all, we gave up SQL more than a decade ago.  We should be part of the discussion.  Heck, even the XQuery guys got in on the action early!

Check out this Google Trends diagram.

Semantic Web vs. NoSQL on Google Trends

NoSQL came out of nowhere in 2009, and now dominates much of the database conversation on the web. Document stores like MongoDB and CouchDB; distributed key-value stores such as Riak and Cassandra; and other, weirder stores like Hadoop-as-database (I never understood that usage myself) now dominate the conversation as the alternative to traditional SQL databases.
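One way to see why the two conversations belong together: the same record is at home in either model. Below, a MongoDB-style document next to its rough equivalent as RDF triples, using rdflib; the data is invented for illustration.

```python
# The same fact in two of the models under discussion: a document-store
# object and its rough equivalent as RDF triples. Data is illustrative.
from rdflib import Graph, Literal, Namespace, RDF

# Document-store shape: one self-contained JSON-like object.
doc = {"_id": "alice", "name": "Alice", "knows": ["bob"]}

# Triple-store shape: the same data as subject-predicate-object statements.
EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.alice, RDF.type, EX.Person))
g.add((EX.alice, EX.name, Literal("Alice")))
g.add((EX.alice, EX.knows, EX.bob))

print(g.serialize(format="turtle"))
```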

Read more
