Thanks everyone for participating in the #SemWebRox Community Challenge!
Looking at the results (which have been pasted at the end of this article for convenience), I’m struck once again by the diversity of points of view in the Semantic Web community on what the key value of its technology really is. Over at Semantic University we summarized what we believe to be the two dominant camps (summary: AI-centric and flexible data management-centric) in the Semantic Web world, and the results of this exercise illustrate clearly that there are many nuances within those camps.
I’ll go into some highlights, but I think the why is still missing in many cases. It’s the classic features-not-value predicament that plagues technologists and frustrates technology marketers. We’re doing better, but we can and must do better still.
Data Flexibility: Data Integration
In terms of data flexibility, there are a number of themes that kept popping up. Aaron Bradley first called out “cheaper enterprise data integration,” and Lee Feigenbaum concurred by stating, “The Semantic Web is the only scalable approach for integrating diverse data.” Another one I liked about data integration was from Abir: “Semantic Web technologies can make it possible to have true bottom-up web-scale automatic information integration.”
Even as semantic web concepts and tools underpin revolutionary changes in the way we discover and consume information, people with even a casual interest in the semantic web have difficulty understanding how and why this is happening. One of the most exciting application areas for semantic technologies is online publishing, yet thousands of small-to-medium sized publishers find the unfamiliar semantic concepts too intimidating to grasp the relevance of these technologies. This three-part series is part of my own journey to better understand how semantic technologies are changing the landscape for publishers of news and information. Read Part 2.
So far we’ve looked at the “cutting edge” of dynamic semantic publishing (BBC Olympics) and we’ve seen what tools large publishers such as the New York Times, Associated Press, and Agence France-Presse are using to semantically annotate their content.
And we’ve learned how semantic systems help publishers “Do More With Less” (automating much of the work of organizing content and identifying key concepts, entities, and subjects) and “Do More With More” (combining their content with related linked open data and presenting it in different contexts).
You may still be asking at this point, “What makes this so novel and cool?” We know that semantic tools save time and resources. And some people say semantic publishing is about search optimization, especially after the arrival of Google’s Knowledge Graph. But the implications of semantic publishing go well beyond search. What semantic systems are really designed for, to use the phrase attributed to Don Turnbull, is “information discovery,” and, if semantic standards and tools are widely adopted in the publishing world, this could have huge implications for content and data syndication.
Andraz Tori is the Owner and Chief Technology Officer at Zemanta, a tool that uses natural language processing (NLP) to extract entities within the text of a blog and enrich it with related media and articles from Zemanta’s broad user base. This interview was conducted for Part 3 of the series “Dynamic Semantic Publishing for Beginners.”
Q. Although the term “Dynamic Semantic Publishing” appears to have come out of the BBC’s coverage of the 2010 World Cup, it looks as though Zemanta has been applying many of the same principles on behalf of smaller publishers since 2008. Would you characterize it this way, or do you think that Zemanta is a more limited service with specific and targeted uses, while the platform built by BBC is its own semantic ecosystem? How broadly should we define Dynamic Semantic Publishing?
A. What Zemanta does is empower the writer through semantic technologies. It’s like having an exoskeleton that gives you superpowers as an author. But Zemanta does not affect the post after it is written. Dynamic semantic publishing, on the other hand, is based on the premise of assembling web pages piecemeal from a semantic database, usually in real time.
Google’s Knowledge Graph has been the subject of a lot of attention in the few days since its announcement. And the focus of a lot of questions, too.
There’s been discussion on chat boards, for instance, about just who’s gotten access and who hasn’t. In a discussion with a representative from Google, The Semantic Web blog has learned that, like many other new Google services, the roll-out is gradual, in order to ensure the system is handling new functions well. First in line are those who are signed into Google, though not everyone who is signed in has it yet. But the plan is to have everyone who’s signed in on board over the next few days, the rep says; so if you are and don’t have it yet, it should be hitting your browser shortly. Those not signed into Google accounts probably have a week or two of a wait left. So far, the rep said that things have been pretty smooth, so Google’s going at the pace it was hoping to.
I’m a fan of the waterfowl model of semantic technology. Clever semantics — as well as ‘advanced’ search boxes, arcane query syntax, and consumer interfaces that require user training — can paddle away as frantically as they like, but only while hidden well below the waterline. SPARQL, SKOS and SQL really shouldn’t be visible to most users of a web site. Ontologies and XML are enabling technologies, not user interface features.
With this week’s unveiling of the Knowledge Graph, Google has taken another step toward realising the potential of their Metaweb acquisition. The company has also clearly demonstrated its continued enthusiasm for delivering additional user value without requiring changes in user behaviour (well, except that those of us outside the US have to remember to use google.com and not our local version, if we want to try this out).
For those who don’t remember, Metaweb was one of those companies that got people excited about the potential for semantic technologies to hit the big time. Founded way back in 2005, Metaweb attracted almost $60 million in investment for their “open, shared database of the world’s knowledge” (Freebase) before disappearing inside Google in 2010.
Today, May 9, 2012 is Global Accessibility Awareness Day (#GAAD). What started with a simple blog-post by Los Angeles Web Developer, Joe Devon, has grown to include events around the world designed to increase awareness about web accessibility issues. To read more about the day and these various activities, see the official GAAD Website and Facebook page.
According to the US Centers for Disease Control and Prevention, “Today, about 50 million Americans, or 1 in 5 people, are living with at least one disability, and most Americans will experience a disability some time during the course of their lives.” In other parts of the world, this number may be significantly higher.
In the interest of full disclosure, Joe Devon is a personal friend of mine, and I must admit that if he were not, I likely wouldn’t have seen his blog post or explored the issues of accessibility as deeply as I have in recent weeks. But I have been exploring, and I’ve been surprised at what I’ve found. In my opinion, Semantic Technology and Assistive Technology are a natural fit for one another, but there seems to be very little discussion or work around the intersection of the two. I have looked, but have not found much collaboration between the two communities. I have also found few individuals who possess much knowledge about both Semantic Tech and Assistive Tech. Of course, if I’ve missed something, please let me know in the comments!
[Editor's Note: This guest post is by Tom Reamy, Chief Knowledge Architect and founder of KAPS Group, a group of knowledge architecture, taxonomy, and eLearning consultants. Tom has 20 years of experience in information architecture, intranet management and consulting, and education and training software. Tom will be presenting a tutorial, Text Analytics for Semantic Applications and moderating a panel, Emotional Semantics - Beyond Sentiment at the upcoming SemTechBiz Conference in San Francisco.]
While sentiment analysis continues to generate a lot of press, it is not clear how much real value organizations are deriving from it. One reason is that the standard approach to sentiment has been mostly statistical and/or based on long lists of sentiment terms. However, if you add other advanced text analytics capabilities, such as auto-categorization using advanced operators, you can not only develop more sophisticated sentiment analysis but also a whole new class of applications that enhance or go beyond simple sentiment analysis.
These advanced operators include such commands as DEST_6 (count two words as a positive indicator only if they are within 6 words of each other) or SENT (only count words in the same sentence).
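To make the idea concrete, here is a minimal sketch of how proximity operators like these might work. The function names and exact semantics are assumptions for illustration, not the vendor's actual operator definitions:

```python
import re

def tokenize(text):
    """Lowercase word tokens, in document order."""
    return re.findall(r"[a-z']+", text.lower())

def dest(tokens, word1, word2, max_dist=6):
    """DEST_6-style operator (hypothetical semantics): True if
    word1 and word2 occur within max_dist words of each other."""
    pos1 = [i for i, t in enumerate(tokens) if t == word1]
    pos2 = [i for i, t in enumerate(tokens) if t == word2]
    return any(abs(i - j) <= max_dist for i in pos1 for j in pos2)

def sent(text, word1, word2):
    """SENT-style operator (hypothetical semantics): True only
    if both words fall in the same sentence, using a naive
    split on sentence-ending punctuation."""
    for sentence in re.split(r"[.!?]", text):
        tokens = tokenize(sentence)
        if word1 in tokens and word2 in tokens:
            return True
    return False
```

The payoff is precision: a bare term list would count “good” in “the service was not at all good” as positive, while a proximity rule can pair “not” with “good” and flip the polarity.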
Are you wondering why your product pages don’t stand out in search results like those from Amazon (shown below) or other competing e-commerce websites? These expanded results are commonly known as Rich Snippets (as named by Google) and are the result of having your HTML structured correctly with semantic markup. Whether you’re savvy to HTML5 and the latest design trends, or you haven’t updated your website code in years, this article will explain why it’s important to structure your data properly using semantic standards.
There are a number of ways to structure your data to make it more relevant to search engines, as well as social media sites. As an e-commerce retailer it is important to understand which of these standards you should consider including in your website. Take the time to implement semantic markup, and to implement it correctly: it gives potential customers key information (product reviews, pricing and stock information, even images) before they ever land on your site. This can increase click-through rates, improve conversions, and generally advance your SEO objectives.
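As a concrete illustration, structured product data is typically expressed with the schema.org vocabulary, in one of several encodings (microdata, RDFa, or JSON-LD). The sketch below builds a schema.org Product description as JSON-LD; the product record and field names are hypothetical, and you should check the search engine's current documentation for which encodings and properties it supports:

```python
import json

# Hypothetical product record; field names are illustrative.
product = {
    "name": "Acme Espresso Machine",
    "price": "249.00",
    "currency": "USD",
    "rating": "4.6",
    "review_count": 132,
    "in_stock": True,
}

def product_jsonld(p):
    """Build a schema.org Product description as JSON-LD."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": p["name"],
        "offers": {
            "@type": "Offer",
            "price": p["price"],
            "priceCurrency": p["currency"],
            "availability": "https://schema.org/InStock"
                if p["in_stock"] else "https://schema.org/OutOfStock",
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": p["rating"],
            "reviewCount": p["review_count"],
        },
    }

print(json.dumps(product_jsonld(product), indent=2))
```

The resulting JSON object would be embedded in the page inside a `<script type="application/ld+json">` element, where search engine crawlers can pick it up without it affecting the visible page.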
NOTE: This post is provided by guest author, Mr. Dennis E. Wisnosky, Chief Technical Officer and Chief Architect, Business Mission Area, U.S. Department of Defense. Dennis will be delivering a Special Presentation, “The Enterprise Information Web: Analytics, Efficiency and Security” at the June SemTechBiz Conference.
Semantic Technology brings a number of unique capabilities to data stores and applications. These capabilities evidence themselves both at the user interaction level, in what users can do with and expect from Semantic technologies, and at the system level, in terms of things applications can do internally without rework or recoding. Semantic Technology, based upon W3C standards, provides capabilities significantly beyond those of proprietary approaches built on technologies founded a half century earlier.
1. User Interaction Capabilities
Access to Meaning
Semantic Technology is based upon the development of an ontology of a particular domain; that is, “what do I need to know to have an unambiguous understanding of a particular thing, organization, subject, etc.?” This knowing is based upon a precise understanding of the meaning of the words used in the domain. A Semantic-Technology-based application depends on, and provides a user with access to, the defined meaning of the terms (the vocabulary, the words) used in the application. This means access both to a human-readable definition, such as one found in a dictionary, and to the formalized definition found in the ontology that frames the system which executes the application. Such access should be presented in a human-consumable form, and this is one of the areas in which formalisms such as Controlled Natural Language (CNL) are useful for translating technical forms of ontologies, such as the W3C-standard Web Ontology Language (OWL), into that form.
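The two layers of meaning described above can be sketched with a toy in-memory ontology fragment. The domain, the term, and the axiom below are hypothetical examples, not drawn from any actual DoD ontology; the axiom is written in OWL's Manchester-syntax style:

```python
# Toy "ontology" fragment: each term carries both a
# human-readable definition and a formal, machine-readable one.
# The term and both definitions are hypothetical examples.
ontology = {
    "Supplier": {
        # Dictionary-style definition for people
        "comment": "An organization that provides goods or "
                   "services to the enterprise under contract.",
        # OWL-style axiom (Manchester syntax) for machines
        "axiom": "Supplier EquivalentTo: Organization and "
                 "(providesGoodsOrServicesTo some Enterprise)",
    },
}

def describe(term):
    """Surface both forms of a term's meaning, as a
    Semantic-Technology-based application might for a user."""
    entry = ontology[term]
    return f"{term}: {entry['comment']}\nFormally: {entry['axiom']}"

print(describe("Supplier"))
```

A real system would hold the formal side in OWL and derive the human-consumable rendering (for instance via a CNL), rather than storing hand-written strings, but the principle of exposing both layers is the same.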