Introduction to: RDF vs XML

There has always been a misconception about the relationship between RDF and XML. The main difference: XML is a syntax, while RDF is a data model.

RDF has several syntaxes (Turtle, N3, etc.), and XML is one of them (known as RDF/XML). In fact, RDF/XML is currently the only W3C standard syntax for RDF (Turtle, a proposed new W3C standard syntax for RDF, is currently in Last Call). Therefore, comparing XML and RDF is like comparing apples and oranges. What can be compared is their data models: the RDF data model is a graph, while the XML data model is a tree.
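To make this concrete, here is the same one-triple RDF graph written in two different syntaxes (an illustrative example using a made-up `http://example.org/` namespace). First in Turtle:

```turtle
@prefix ex: <http://example.org/> .
ex:alice ex:knows ex:bob .
```

And the equivalent in RDF/XML:

```xml
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/">
  <rdf:Description rdf:about="http://example.org/alice">
    <ex:knows rdf:resource="http://example.org/bob"/>
  </rdf:Description>
</rdf:RDF>
```

Any conformant RDF parser reads both documents into the identical graph: the single triple (ex:alice, ex:knows, ex:bob). The syntaxes differ; the data model does not.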

Comparing RDF with XML

Joshua Tauberer has written an excellent comparison of RDF and XML, which I recommend. It highlights two advantages of RDF: the flexibility of the data model and the use of URIs as globally unique identifiers.

Read more

“Innovation Spotlight” Interview with Elliot Turner, CEO of AlchemyAPI

If you would like your company to be considered for an interview please email editor[ at ]semanticweb[ dot ]com.

In this segment of our “Innovation Spotlight” we spoke with Elliot Turner (@eturner303), founder and CEO of AlchemyAPI, whose cloud-based platform processes around 2.5 billion requests per month. Elliot describes how their API helps companies with sentiment analysis, entity extraction, linked data, text mining, and keyword extraction.

Sean: Hi Elliot, thanks for joining us, how did AlchemyAPI get started?

Elliot: AlchemyAPI was founded in 2005 and in the past seven years has become one of the most widely used semantic analysis APIs, processing billions of transactions monthly for customers across dozens of countries.

I am the Founder and CEO and a serial entrepreneur who comes from the information security space.  My previous company built and sold high-speed network security appliances. After it was acquired, I started AlchemyAPI to focus on the problem of understanding natural human language and written communications.

Sean: Can you describe how your API works? What does it allow your customers to accomplish?

Elliot: Customers submit content via a cloud-based API, and AlchemyAPI analyzes that information in real-time, transforming opaque blobs of text into structured data that can be used to drive a number of business functions. The service is capable of processing thousands of customer transactions every second, enabling our customers to perform large-scale text analysis and content analytics without significant capital investment.

Read more

Linked Open Government Data: Dispatch from the Second International Open Government Data Conference

“What we have found with this project is… the capacity to take value out of open data is very limited.”

With the abatement of the media buzz surrounding open data since the first International Open Government Data Conference (IOGDC) was held in November 2011, it would be easy to believe that the task of opening up government data for public consumption is a fait accompli. Most of the discussion at this year’s IOGDC, held July 10-12, centered on the advantages of and roadblocks to creating an open data ecosystem within government, and on the need to establish the right mix of policies to promote a culture of openness and sharing both within and between government agencies and externally with journalists, civil society, and the public at large. By the numbers, the open government data movement has much to celebrate: 1,022,787 datasets from 192 catalogs in 24 languages, representing 43 countries and international organizations.

The looming questions about the utility of open government data make it clear, however, that the movement is still in its early stages. Much remains to be done to provide usable, reliable, machine-readable, and valuable government data to the public.

Read more

SemWebRox Community Challenge: Results

Thanks, everyone, for participating in the #SemWebRox Community Challenge!

Looking at the results (which have been pasted at the end of this article for convenience), I’m struck once again by the diversity of points of view in the Semantic Web community on what the key value of its technology really is. Over at Semantic University we summarized what we believe to be the two dominant camps (summary: AI-centric and flexible data management-centric) in the Semantic Web world, and the results of this exercise illustrate clearly that there are many nuances within those camps.

I’ll go into some highlights, but I think the why is still missing in many cases.  It’s the classic features-not-value predicament that plagues technologists and frustrates technology marketers.  We’re doing better, but we can and must do better still.

Data Flexibility: Data Integration

In terms of data flexibility, there are a number of themes that kept popping up. Aaron Bradley first called out “cheaper enterprise data integration,” and Lee Feigenbaum concurred by stating, “The Semantic Web is the only scalable approach for integrating diverse data.” Another one I liked about data integration was from Abir: “Semantic Web technologies can make it possible to have true bottom-up web-scale automatic information integration.”

Read more

Dynamic Semantic Publishing for Beginners, Part 3

Even as semantic web concepts and tools underpin revolutionary changes in the way we discover and consume information, people with even a casual interest in the semantic web have difficulty understanding how and why this is happening. One of the most exciting application areas for semantic technologies is online publishing, yet for thousands of small-to-medium-sized publishers, unfamiliar semantic concepts make it hard to grasp the relevance of these technologies. This three-part series is part of my own journey to better understand how semantic technologies are changing the landscape for publishers of news and information. Read Part 2.


So far we’ve looked at the “cutting edge” of dynamic semantic publishing (BBC Olympics) and we’ve seen what tools large publishers such as the New York Times, Associated Press, and Agence France Press are using to semantically annotate their content.

And we’ve learned how semantic systems help publishers “Do More With Less” (automate much of the work of organizing content and identifying key concepts, entities, and subjects) and “Do More With More” (combine their content with related linked open data and present it in different contexts).

You may still be asking at this point, “What makes this so novel and cool? We know that semantic tools save time and resources. And some people say semantic publishing is about search optimization, especially after the arrival of Google’s Knowledge Graph.” But the implications of semantic publishing are about oh so much more than search. What semantic systems are really designed for, to use the phrase attributed to Don Turnbull, is “information discovery,” and, if semantic standards and tools are widely adopted in the publishing world, this could have huge implications for content and data syndication.

Read more

A Simple Tool in a Complex World: An Interview with Zemanta CTO Andraz Tori


Andraz Tori is the Owner and Chief Technology Officer at Zemanta, whose tool uses natural language processing (NLP) to extract entities from the text of a blog post and enrich it with related media and articles drawn from Zemanta’s broad user base. This interview was conducted for Part 3 of the series “Dynamic Semantic Publishing for Beginners.”

Q. Although the term “Dynamic Semantic Publishing” appears to have come out of the BBC’s coverage of the 2010 World Cup, it looks as though Zemanta has been applying many of the same principles on behalf of smaller publishers since 2008.  Would you characterize it this way, or do you think that Zemanta is a more limited service with specific and targeted uses, while the platform built by BBC is its own semantic ecosystem?  How broadly should we define Dynamic Semantic Publishing?

A. What Zemanta does is empower the writer through semantic technologies. It’s like having an exoskeleton that gives you superpowers as an author. But Zemanta does not affect the post after it has been written. Dynamic semantic publishing, on the other hand, is based on the premise of assembling web pages piecemeal from a semantic database, usually in real time.

Read more

Google Knowledge Graph Interview

Google’s Knowledge Graph has been the subject of lots of attention over the past few days since the announcement. And the focus of a lot of questions, too.

There’s been discussion on chat boards, for instance, about just who has gotten access and who hasn’t. In a discussion with a representative from Google, The Semantic Web blog has learned that, like many other new Google services, the roll-out is gradual, to ensure the system is handling the new functions well. Access is first-come, first-served for users signed into Google – though not everyone who is signed in has it yet. The plan is to have everyone who is signed in on board over the next few days, the rep says; so if you’re signed in and don’t have it yet, it should be hitting your browser shortly. Those not signed into Google accounts probably have a week or two left to wait. So far, the rep said, things have been pretty smooth, and Google is going at the pace it was hoping to.

Read more

Google’s Knowledge Graph Is No Ugly Duckling

I’m a fan of the waterfowl model of semantic technology. Clever semantics — as well as ‘advanced’ search boxes, arcane query syntax, and consumer interfaces that require user training — can paddle away as frantically as they like, but only while hidden well below the waterline. SPARQL, SKOS and SQL really shouldn’t be visible to most users of a web site. Ontologies and XML are enabling technologies, not user interface features.

With this week’s unveiling of the Knowledge Graph, Google has taken another step toward realising the potential of their Metaweb acquisition. The company has also clearly demonstrated its continued enthusiasm for delivering additional user value without requiring changes in user behaviour (well, except that those of us outside the US have to remember to use google.com and not our local version, if we want to try this out).

For those who don’t remember, Metaweb was one of those companies that got people excited about the potential for semantic technologies to hit the big time. Founded way back in 2005, Metaweb attracted almost $60 million in investment for their “open, shared database of the world’s knowledge” (Freebase) before disappearing inside Google in 2010.

Read more

Global Accessibility Awareness Day is Today – but where’s the Sem Tech?

Today, May 9, 2012, is Global Accessibility Awareness Day (#GAAD). What started with a simple blog post by Los Angeles web developer Joe Devon has grown to include events around the world designed to increase awareness of web accessibility issues. To read more about the day and these various activities, see the official GAAD website and Facebook page.

According to the US Centers for Disease Control and Prevention, “Today, about 50 million Americans, or 1 in 5 people, are living with at least one disability, and most Americans will experience a disability some time during the course of their lives.” In other parts of the world, this number may be significantly higher.

In the interest of full disclosure, Joe Devon is a personal friend of mine, and I must admit that if he were not, I likely wouldn’t have seen his blog post or explored the issues of accessibility as deeply as I have in recent weeks. But I have been exploring, and I’ve been surprised at what I’ve found. In my opinion, Semantic Technology and Assistive Technology are a natural fit for one another, but there seems to be very little discussion or work around the intersection of the two. I have looked, but have not found much collaboration between the two communities. I have also found few individuals who possess much knowledge about both Semantic Tech and Assistive Tech. Of course, if I’ve missed something, please let me know in the comments!

Read more

Beyond Sentiment

[Editor’s Note: This guest post is by Tom Reamy, Chief Knowledge Architect and founder of KAPS Group, a group of knowledge architecture, taxonomy, and eLearning consultants. Tom has 20 years of experience in information architecture, intranet management and consulting, and education and training software.  Tom will be presenting a tutorial, Text Analytics for Semantic Applications and moderating a panel, Emotional Semantics – Beyond Sentiment at the upcoming SemTechBiz Conference in San Francisco.]

While sentiment analysis continues to generate a lot of press, it is not clear how much real value organizations are deriving from it. One reason is that the standard approach to sentiment has been mostly statistical and/or based on long lists of sentiment terms. However, if you add in other advanced text analytics capabilities, such as auto-categorization using advanced operators, you can not only develop more sophisticated sentiment analysis, you can also develop a whole new class of applications that enhance or go beyond simple sentiment analysis.

These advanced operators include such commands as DEST_6 (count two words as a positive indicator only if they are within 6 words of each other) or SENT (only count words in the same sentence).
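To make the idea concrete, here is a minimal, purely illustrative sketch in Python of how such proximity operators might behave. The function names, tokenization, and scoring are my own assumptions for illustration – they are not KAPS Group’s or any vendor’s actual implementation:

```python
import re

def within_n(tokens, w1, w2, n):
    # Analogue of a hypothetical DEST_n operator: True if w1 and w2
    # occur within n tokens of each other anywhere in the token list.
    pos1 = [i for i, t in enumerate(tokens) if t == w1]
    pos2 = [i for i, t in enumerate(tokens) if t == w2]
    return any(abs(i - j) <= n for i in pos1 for j in pos2)

def same_sentence(text, w1, w2):
    # Analogue of a hypothetical SENT operator: True if w1 and w2
    # appear together in at least one sentence.
    for sentence in re.split(r"[.!?]+", text):
        words = re.findall(r"[a-z]+", sentence.lower())
        if w1 in words and w2 in words:
            return True
    return False

text = "The battery life is great. The screen, however, is not great at all."
tokens = re.findall(r"[a-z]+", text.lower())

print(within_n(tokens, "battery", "great", 6))   # "battery" and "great" fall close together
print(same_sentence(text, "screen", "battery"))  # the two words never share a sentence
```

The point of such operators is that proximity and sentence scope carry signal a flat term list misses: “great” near “battery” is evidence about the battery, not about the screen two sentences away.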

Read more