
Posts Tagged ‘NoSQL’

Graph Database Adoption on the Rise

Emil Eifrem, CEO of Neo Technology, recently wrote an article highlighting the rise in graph database adoption. He writes, “Graph databases are the most scalable, high performance way to query and store highly interconnected data. They help improve intelligence, predictive analytics, social network analysis, decision and process management – which all involve highly connected data with lots of relationships. A relevant use case for graph databases is the social graph. The social graph leverages information across a range of networks to understand the relationships between individuals. Facebook, LinkedIn and Amazon are all examples of companies that derived tremendous value from leveraging social and professional graphs and providing a deeper analysis of the data they collect every day. The biggest challenge that companies face is the ability to handle the exponential growth and massive connected data challenges associated with the social graph.” Read more here.

SindiceTech Helps Enterprises Build Private Linked Data Clouds

Last week The Semantic Web Blog covered the launch of the SindiceTech Assisted SPARQL Editor as an open source project, noting that SparQLed is also part of SindiceTech’s commercial suite for large enterprises building private linked data clouds. This week, we’ll dive a little deeper into SindiceTech and its progress since the founders of the Sindice web of data search engine turned their attention to the commercial application of their technology as a real-time semantic warehousing infrastructure, which leverages cloud computing to integrate and normalize the massive amounts of data the enterprise must deal with.

 

As SindiceTech founder and CEO Giovanni Tummarello explains, companies actually approached his team to help them make a reality of their visions to use RDF and SPARQL, as the best knowledge representation and querying technologies available, by providing the missing scalability and stability. Sindice.com was evidence that the technology the team had developed could answer these enterprises’ needs; currently there are about 700 million semantically marked-up web pages indexed in the Sindice.com search engine, with a live updated index of some 80 billion triples daily. Its database is over 5 terabytes.


A Simple Tool in a Complex World: An Interview with Zemanta CTO Andraz Tori

 

Andraz Tori is the Owner and Chief Technology Officer at Zemanta, a tool that uses natural language processing (NLP) to extract entities within the text of a blog and enrich it with related media and articles from Zemanta’s broad user base. This interview was conducted for Part 3 of the series “Dynamic Semantic Publishing for Beginners.”

Q. Although the term “Dynamic Semantic Publishing” appears to have come out of the BBC’s coverage of the 2010 World Cup, it looks as though Zemanta has been applying many of the same principles on behalf of smaller publishers since 2008.  Would you characterize it this way, or do you think that Zemanta is a more limited service with specific and targeted uses, while the platform built by BBC is its own semantic ecosystem?  How broadly should we define Dynamic Semantic Publishing?

A. What Zemanta does is empower the writer through semantic technologies. It’s like having an exoskeleton that gives you superpowers as an author. But Zemanta does not affect the post after it is written. Dynamic semantic publishing, on the other hand, is based on the premise of bringing together web pages piecemeal from a semantic database, usually in real time.


Linked Data: Moving Towards Consumption

Earlier this month 16 out of 42 papers were accepted for the upcoming Linked Data on the Web (LDOW) 2012 Workshop in Lyon, France in April.

What might be discerned from the tenor of the submissions is something of a shift in focus in the Linked Data space, according to workshop chair Dr. Michael Hausenblas, Linked Data Research Centre, DERI, NUI Galway, Ireland. Other organizing committee members include Tim Berners-Lee, Christian Bizer and Tom Heath. “In 2008 to 2010 it was more like we were establishing the field, getting people to talk about what they do in terms of publishing and best practice around Linked Data, Open Linked Data and Linked Enterprise Data,” says Hausenblas. Now, with the web of Linked Data having grown to about 32 billion RDF triples last year, “we’re moving more towards the consumption – publishing is a necessary precondition but not an end in itself.”


Breaking into the NoSQL Conversation

Rob Gonzalez, Cambridge Semantics

Semantic Web Community: I’m disappointed in us! Or at least in our group marketing prowess. We have been failing to capitalize on two major trends that everyone has been talking about and that are directly addressable by Semantic Web technologies! For shame.

I’m talking of course about Big Data and NoSQL.  Given that I’ve already given my take on how Semantic Web technology can help with the Big Data problem on SemanticWeb.com, this time around I’ll tackle NoSQL and the Semantic Web.

After all, we gave up SQL more than a decade ago.  We should be part of the discussion.  Heck, even the XQuery guys got in on the action early!

Check out this Google Trends diagram.

Semantic Web vs. NoSQL on Google Trends

NoSQL came out of nowhere in 2009, and now dominates much of the database conversation on the web. Document stores like MongoDB and CouchDB, distributed key-value stores such as Riak and Cassandra, and other weird stores like Hadoop-as-database (never understood that usage myself) are now discussed as the alternative to traditional SQL databases.


Report from Day 2 at ISWC

[Editor's Note: This week, Juan Sequeda is reporting in from the International Semantic Web Conference in Bonn, Germany.]

Day 2 of ISWC consisted of 7 workshops and 3 tutorials. One of the most popular workshops was Ontology Matching, which seems to be evolving from matching ontologies alone to matching instances as well, due to the rise of Linked Data. The Scalable Semantic Web Knowledge Base Systems workshop presented several works on RDF and NoSQL databases, such as CumulusRDF.


When it Comes to Data Management on the Semantic Web, HBase has the Edge

Researchers at the University of Texas – Pan American have found that HBase “has the edge in data management for next generation Internet and cloud computing users.” The article states, “An open-source, non-relational database written in Java that can scale to thousands of servers, HBase makes many features of Google’s proprietary, high-performance distributed storage system BigTable available to the programming community. It also features a fail-safe library that runs ‘on top of’ a server cluster — a global architecture that detects and handles failures at the local level before they spread.”

What One Trillion Means for the Semantic Web

Mitchell Shults commented on the significance of Franz’s recent success loading one trillion triples. Shults writes, “Triplestores are perfect for making sense out of extremely complex data. However, a triplestore is only useful if massive quantities of information can be loaded, updated and effectively queried in a reasonable amount of time. That is why Franz Technology’s announcement is so interesting.”

Franz’s NoSQL Database Successfully Loads 1 Trillion RDF Triples

Franz’s NoSQL database, AllegroGraph, has become the first NoSQL database to load over one trillion RDF triples, a feat that is being called “a major step forward in scalability for the Semantic Web.” According to the article, “A trillion RDF Statements eclipses the current state of the art for the Semantic Web data management but is a primary interest for companies like Amdocs that use triples to represent real-time knowledge about telecom customers. Per-customer, Amdocs uses about 4,000 triples, so a large telecom like China Mobile would easily need 2 trillion triples to have detailed knowledge about each single customer.”
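The figures in the quote can be checked with a little arithmetic. The sketch below only divides the numbers quoted above; the implied customer counts are a consequence of those numbers, not figures reported in the article, and the function and constant names are made up for illustration.

```python
# Figures taken from the quote above.
TRIPLES_PER_CUSTOMER = 4_000            # "about 4,000 triples" per customer
TRIPLES_NEEDED = 2_000_000_000_000      # "2 trillion triples" for a large telecom
TRIPLES_LOADED = 1_000_000_000_000      # the one-trillion-triple load

def customers_covered(total_triples, per_customer=TRIPLES_PER_CUSTOMER):
    """Number of customers a given triple budget can describe (hypothetical helper)."""
    return total_triples // per_customer

# Customer base implied by the 2 trillion figure: 500 million.
implied_customers = customers_covered(TRIPLES_NEEDED)

# Customers that fit into the one trillion triples actually loaded: 250 million.
covered_by_load = customers_covered(TRIPLES_LOADED)

print(f"{implied_customers:,} customers implied, {covered_by_load:,} covered by the load")
```

On these numbers, the one-trillion-triple load would cover about half of the customer base implied by the quote.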

Native XML Databases and RDF

There are three trends that I observed at SemTech 2011 in San Francisco last week. First was the increased role of native XML databases used in combination with RDF data stores. Second was the many natural-language processing tools and vendors at the conference. And third was the role of semantic annotations and standards directly in web content. I think these trends are related.

One of the keynote presentations at the SemTech 2011 conference was given by the BBC. They presented their core architecture for managing web content as having two main components: a native XML database (MarkLogic) for content and an RDF triple store for “metadata.” These tools were at the core of the architecture for their web sites.
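To make the two-component split concrete, here is a minimal sketch in Python. It is not the BBC's implementation: a plain dictionary of XML strings stands in for the XML database, a set of tuples stands in for the triple store, and every document, predicate, and function name is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Stand-in for the XML content store: document ID -> XML string (invented data).
CONTENT = {
    "doc1": "<article><title>Match report</title><body>Spain win the final.</body></article>",
    "doc2": "<article><title>Squad news</title><body>England name their squad.</body></article>",
}

# Stand-in for the triple store: (subject, predicate, object) metadata (invented data).
TRIPLES = {
    ("doc1", "about", "Spain"),
    ("doc1", "about", "World Cup"),
    ("doc2", "about", "England"),
    ("doc2", "about", "World Cup"),
}

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

def titles_about(topic):
    """Ask the metadata which documents cover a topic, then read their titles from the content."""
    doc_ids = [subject for subject, _, _ in match(p="about", o=topic)]
    return [ET.fromstring(CONTENT[doc_id]).findtext("title") for doc_id in doc_ids]

print(titles_about("World Cup"))
```

The point of the split is that the question of which documents are relevant is answered from the metadata alone, and the content store is consulted only to fetch the documents that matched.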

Another presentation was given by the Mayo Clinic. They are also using MarkLogic for web content alongside semantic web technologies. Their diagrams show that there are many ways for these systems to interact.

