Rob Gonzalez

SemWebRox Community Challenge: Results

Thanks everyone for participating in the #SemWebRox Community Challenge!

Looking at the results (which have been pasted at the end of this article for convenience), I’m struck once again by the diversity of points of view in the Semantic Web community on what the key value of its technology really is. Over at Semantic University we summarized what we believe to be the two dominant camps (summary: AI-centric and flexible data management-centric) in the Semantic Web world, and the results of this exercise illustrate clearly that there are many nuances within those camps.

I’ll go into some highlights, but I think the why is still missing in many cases.  It’s the classic features-not-value predicament that plagues technologists and frustrates technology marketers.  We’re doing better, but we can and must do better still.

Data Flexibility: Data Integration

In terms of data flexibility, there are a number of themes that kept popping up.  Aaron Bradley first called out “cheaper enterprise data integration,” and Lee Feigenbaum concurred by stating, “The Semantic Web is the only scalable approach for integrating diverse data.”  Another one I liked about data integration was from Abir: “Semantic Web technologies can make it possible to have true bottom-up web-scale automatic information integration.”


Community Challenge: The Semantic Web in 140 Characters

As a community, we Semantic Webbers have done a poor job communicating our value clearly and concisely.

Last week, I stated the case in more detail at the Enterprise Semantics Blog, and Aaron Bradley continued it over at Google Plus (here and here).

Today, I bring you a challenge.

Describe a value of the Semantic Web clearly in 140 characters. Tweet it with the hashtag #SemWebRox.

Why a value and not the value? Because different people have different opinions on what the most important facet of the Semantic Web is. And since you can’t have more than one most important value, just stick to one, and make it convincing.

Why 140 characters? It’s not just Twitter. Restricting space in this way forces you to get to the core of your argument. No elaboration. No amendments. Straight value.

Here is my attempt:

Write yours here or tweet it directly. We’ll aggregate them and include them in a future post!

Breaking into the NoSQL Conversation

Semantic Web Community: I’m disappointed in us!  Or at least in our group marketing prowess.  We have been failing to capitalize on two major trends that everyone has been talking about and that are directly addressable by Semantic Web technologies!  For shame.

I’m talking, of course, about Big Data and NoSQL.  Since I’ve already given my take on how Semantic Web technology can help with the Big Data problem, this time around I’ll tackle NoSQL and the Semantic Web.

After all, we gave up SQL more than a decade ago.  We should be part of the discussion.  Heck, even the XQuery guys got in on the action early!

Check out this Google Trends diagram.

Semantic Web vs. NoSQL on Google Trends


NoSQL came out of nowhere in 2009 and now drives much of the database conversation on the web.  Document stores like MongoDB and CouchDB, distributed key-value stores such as Riak and Cassandra, and other weird stores like Hadoop-as-database (never understood that usage myself) now dominate the conversation as the alternative to traditional SQL databases.


Down with the Data Warehouse! Long Live the Semantic Data Warehouse!

I had a call with a Fortune 100 IT team that is looking at using semantic technology as an alternative to the Data Warehouse.  This is my favorite kind of conversation, since I firmly believe the traditional data warehouse is dead but just doesn’t know it yet.

This is the situation the IT team explained:

We need to aggregate information and present it to the user, so we build a warehouse.  We spend all this time building and designing the warehouse, and when it’s done they need something else.  Unfortunately, it’s not so easy to modify a warehouse once it’s running, so we build another one.  And then another.  The cycle has been repeating itself for years and is not sustainable.

Philadelphia Spectrum demolition: brick by brick

The alternative to warehousing is Data Virtualization (EII, Data Federation…lots of terms for it)…or, at least that’s what they, and many others, see!  Essentially, they have been burned by years of working with an inflexible technology, so they are looking to dump the approach altogether.

I get this.  If a durian were the only fruit you’d ever smelled, you’d think all fruit was really stinky.


Two Kinds of Big Data

With all the hullabaloo around Big Data, I’ve been a little surprised that there hasn’t been more talk about how to consume the vast petabytes that people are talking about…until I realized that there are really two Big Data problems out there!

Roughly speaking, the two primary ways in which data scales are by adding depth and by adding breadth.  The first is what most people mean when they refer to Big Data.  Want to run analytics on every single transaction that Wal*Mart has done over 10 years to analyze trends?  THAT is vertical scale.  Technically, you can characterize it as having lots and lots of similarly structured data.  That is where technologies like Hadoop and column-based data storage make a big difference.

Horizontal Big Data, on the other hand, is like the Linked Data Cloud.   It has all kinds of random information that ranges from highly structured and numeric to highly unstructured.  Significantly, it tends to change quite a bit over time with increasing heterogeneity.  That’s a completely different kind of scale, and one that is not well solved by using highly structured, vertically scaling technologies.


Don’t Call Me a Cartographer…

When I first learned to drive, I could only navigate through my town by way of my parents’ house.  If I wanted to go from my high school to a soccer field, I’d have to go by way of my house, even if it took twice as long, since it was the only way I knew to go.  It was a hub-and-spoke system with one hub through which I passed to go anywhere.  Even today, my parents still joke with me that I can get lost in my own house…and I wish I could disagree with them.

So, in theory I understand and sympathize with why people interested in the Semantic Web and ontologies care about ontology visualization tools; you know: those nifty graphical ontology explorers that center on whatever node you’re looking at and allow you to follow edges of the graph and move things around.  When you’re dealing with lots of classes, such views help you find your way around the chaos and not get lost.  And I love not getting lost.

In practice, however, I have yet to see a real use case for one.  In all the implementations of Semantic Web-based solutions that we’ve done over the years, starting back in 2003 at IBM, I’ve yet to see a legitimate, real-world use case where an ontology visualizer is used for something more than a pretty demo.


I’ve got a Federated Bridge to Sell You (A Defense of the Warehouse)

[Editor’s Note: At the Semantic Web Summit conference in Boston in November, a discussion arose around Federated Data vs. Data Warehousing.  Rob Gonzalez of Cambridge Semantics raised some very interesting points that I asked him to expand on in the post below.  And whether you agree or disagree, we want to hear from you.]

The Semantic Web dream of data federation is awesome.  You type in a query, and magical, intelligent agents scurry all over the datasphere, bringing back information to give you a complete, up-to-date, correct answer to your question.  No need for a messy, time-consuming datamart project!  What’s not to love, right?

Eric asked me to write this piece, and so I find myself in the unenviable position of having to tell you, dear reader and Semantic Web fan, that there is no Santa Claus (of data federation).  I’d like to make a case for the continued need for data consolidation in datamarts (yes, even in the Semantic Web world) to gain real value from your enterprise data.