Archives: June 2008

Microsoft buying into Semantics – part 1

The news yesterday that Microsoft is likely buying semantic search provider Powerset had those of us in the community buzzing. Beyond the valuation itself, the deal prompts several thoughts about the maturity of our technology, its value and its future.

Read more

A New Paradigm

Jennifer Zaino Contributor

Paradigm5, a new semantic web-based business networking tool, launched June 16. The next day, at the Linked Data Planet conference in New York City, Marcus Trevisani, CTO at Relevant Digital, talked about the new service, which is powered by the semantic technology company’s Discovery Engine. The Discovery Engine is billed as a matching technology that enables users to find, retrieve and use data on the Internet and in corporate enterprise databases more effectively.

“We need a way to represent resources in a fundamentally common way, and to identify resources whatever they may be,” said Trevisani. “With the advent of the semantic web, linked data and RDF triples, there’s a perfect way to be able to do that.”

The issue with broad-scale semantic search as it typically exists, he said, is that there’s too much information, making the task time-intensive and laborious.

“Search involves your full attention, you are locked in front of that engine spending time trying to get results,” he said. “Wouldn’t it be better if there were some system that could do that for you? That you would broadcast some request and the system would go out and find it, tell you when it has found it, bring it back to you without your having to sit there and be captive to that process? That’s what semantic discovery is – systems that perform semantic resource matching based on resources or requests and [that issue] recommendations on a continuous basis. It’s an agent that continually monitors its environment. You end up getting asynchronous, continuous and relevant action.”

Trevisani contends that many businesses can be built from this model – pushing ahead the concept of monetizing the semantic web, which hasn’t been fully realized yet. That includes enterprise collaboration – for example, applications could be built to retain the knowledge of laid-off workers in this age of merger and acquisition activity.

“M&A…it’s usually a cut-and-run attempt to reduce staff. Then they realize they lost a lot of knowledge,” he said. “If there was an application that could take the profile of every user, the documents they worked on, and index them and then have them stored continually in the system and active, that knowledge is never lost. Even if they are laid off, you can ask them questions related to business transactions. It gives you the ability to avoid a lot of the inefficiencies that come about when you merge companies together.”

Social networking applications can also be built on top of this model, enabling comparison of profiles between individuals for shared identities and interests.

“By identifying semantic objects and resources in such a way that they can interoperate, you start to provide solutions to many different industries,” he said.

Read more

Tagging and the Semantic Web

John Clarke Mills Contributor

A while back I commented on a TechCrunch article quoting Twine CEO Nova Spivack regarding keyword searches in the Semantic Web space. My comment was later quoted on the Faviki blog, a semantic startup involving tagging web pages with semantic Wikipedia data. I thought it would be useful here to go into a little bit more depth on semantic tagging and what we’ve learned thus far.

Tags the way they are implemented today

The way the better Web 2.0 sites implement tags involves faceting. In a nutshell, it allows you to group together documents or objects based on attributes. For example, a collection of all documents about ‘George Bush’ and ‘Washington’. The problem with these attributes is they have little or no value on their own and they certainly are not understood by computers. They are just strings denoting some type of concept. To that end, here is a short list of limitations that the Semantic Web will address:

* Tags do not provide enough meaningful metadata to make meaningful comparisons.

* More information is needed besides their origin.

* Tags are essentially a full text search mechanism, although faceting helps.

* Need more relationships between tags and the objects they pertain to.

The solution: Tags as objects

Allowing users to tag an object with another object allows us to make extremely interesting comparisons; discerning a lot more information about the original object becomes simple and accurate. With this type of interrelationship we can pivot through the data like never before, not with full text search but with object graph linkages that machines and humans can understand.

Let’s go over an example:

Let’s say a user adds a note into our system ranting about a beet farmer who lives in Washington state by the name of William Gates. The user goes on to discuss his beets and farming techniques in great detail, mentioning nothing about software and Windows Vista, of course. In the current Internet model the user would tag this note with strings like, ‘William Gates’, ‘Bill Gates’, ‘beets’, etc.

Now another user comes along and starts digging through documents tagged ‘Bill Gates’ to try and find new articles about Vista. Unfortunately, many searches will turn up bad results, especially if the phrase ‘Bill Gates’ appears densely enough in the document about beets. That being said, the other direction would work more as intended: searching on the tags ‘Bill Gates’ and ‘Beets’ would yield more expected results.

In the Semantic Web model, the document about William Gates (the beet farmer) would be tagged with the William Gates object that could contain a plethora of metadata, including his location, occupation, etc. Now when we look at this document there is no guessing as to what it is referring, especially from a machine’s point of view. This is exactly what the Semantic Web was built for. In this model we are not relying on linguistics, natural language processing, or full text search. We are relying on hard links that machines can understand and relate to.
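The beet-farmer example can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular system implements it: the `Entity` and `Note` classes, the `example.org` identifiers and the note titles are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A tag that is an object: a stable identifier plus metadata."""
    uri: str          # hypothetical identifier, invented for this sketch
    name: str
    occupation: str
    location: str


@dataclass
class Note:
    title: str
    string_tags: set = field(default_factory=set)   # plain strings, Web 2.0 style
    object_tags: set = field(default_factory=set)   # Entity objects


farmer = Entity("http://example.org/people/william-gates-farmer",
                "William Gates", "beet farmer", "Washington")
founder = Entity("http://example.org/people/bill-gates-software",
                 "Bill Gates", "software executive", "Washington")

notes = [
    Note("Rant about beets", {"Bill Gates", "beets"}, {farmer}),
    Note("Vista review", {"Bill Gates", "Vista"}, {founder}),
]


def by_string(tag):
    """String tags match on spelling only, so both people collide."""
    return [n.title for n in notes if tag in n.string_tags]


def by_object(entity):
    """Object tags match on identity, so the two people stay distinct."""
    return [n.title for n in notes if entity in n.object_tags]


string_hits = by_string("Bill Gates")
object_hits = by_object(founder)
print(string_hits)   # ['Rant about beets', 'Vista review']
print(object_hits)   # ['Vista review']
```

The string search returns the beet rant alongside the Vista review, while the object search returns only the note linked to the intended person. The metadata on each `Entity` (occupation, location) is also available for further comparisons.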

Read more

Berners-Lee Leads the Way on Linked Data

Jennifer Zaino Contributor

NEW YORK — At the LinkedData Planet Conference & Expo here this week, Sir Tim Berners-Lee talked about freeing data by linking it using open standards, both in his keynote and in a press conference before his speech. Or about not freeing data, as long as you make that decision because you don’t want to share the data, not because you can’t.

Berners-Lee made a number of interesting points around that idea, and commented on some other trends that will accompany the development of the semantic web. Here are some of the highlights of his analysis:

On linking data using open standards in the enterprise

Most companies struggle with the problem of trying to look across disparate silos of information every time they have to react to an event. “Within a company, within the firewall, you need to be able to access [that data]. If I start off talking to a CIO about using open standards to interrelate with other companies — I get back, don’t even go there. Inside my company we have all these data pipes. We need to [link] it internally before we integrate it up and down the supply chain.

“Then you make a business decision about what data you want to share. That said, the technology doesn’t mean you are forced to share everything. You can keep a lot of stuff behind the firewall. But, you may find that business runs better if you do share a lot with your partners,” Berners-Lee said.

On a data browser

“We don’t have a generic data browser,” he said, noting, however, that MIT has a project called Tabulator, a Firefox extension that attempts to be a generic browser for linked data on the web. “Instead of looking at pages, it looks at things,” and what is related to them, finding patterns.

For example, you might find your town in DBpedia, and then a singer on MusicBrainz who was born in that town, and then an album that singer has brought out, and then more singers with albums who hail from towns close to your own. “It lets you go between exploring the web and using it [like] a spreadsheet basis. That’s the sort of thing I’d like to be able to use as a generic data browser. When we have linked data, more and more we will find that a good user interface is a tremendous research challenge, and there will be a huge competitive market to allow a user to get the most of it.”
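The town-to-singer-to-album walk described above amounts to following links across a set of subject-predicate-object statements. Here is a minimal sketch, assuming a tiny in-memory list of triples; the identifiers, predicate names and data are made up for illustration and do not come from DBpedia, MusicBrainz or Tabulator.

```python
# Hypothetical triples (subject, predicate, object); real linked data would
# use full URIs published by different sources.
TRIPLES = [
    ("town:Springfield", "near", "town:Shelbyville"),
    ("singer:Alice", "bornIn", "town:Springfield"),
    ("singer:Bob", "bornIn", "town:Shelbyville"),
    ("album:FirstLight", "by", "singer:Alice"),
    ("album:RiverSongs", "by", "singer:Bob"),
]


def subjects(predicate, obj):
    """All subjects that point at `obj` through `predicate`."""
    return sorted(s for s, p, o in TRIPLES if p == predicate and o == obj)


def objects(subject, predicate):
    """All objects that `subject` points at through `predicate`."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def albums_from_town(town):
    """Map each singer born in `town` to the albums they released."""
    return {singer: subjects("by", singer) for singer in subjects("bornIn", town)}


home = "town:Springfield"
local = albums_from_town(home)
nearby = {town: albums_from_town(town) for town in objects(home, "near")}
print(local)    # {'singer:Alice': ['album:FirstLight']}
print(nearby)   # {'town:Shelbyville': {'singer:Bob': ['album:RiverSongs']}}
```

Each hop is the same lookup over the same data shape, which is why a generic browser is conceivable: it does not need to know in advance that the things involved are towns, singers or albums.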

On the social, ethical consequences of proliferating personal data

Most people involved in the semantic web are building programs with a strong awareness of the issue of data trust, Berners-Lee said, noting that a lot of people are worried about the provenance of data and how much to trust it. But perhaps the bigger issue is that there are lots of cases where you can’t keep data locked up — for example, in its migration across social networking sites. So “social networking sites have to track what you wanted other people to use that data for. There has to be more accountability and more of a concept of acceptable use,” he said.

Read more

Firefox 3: The Semantic Web Browser?

Sean Michael Kerner Contributor

The Semantic Web is a lofty idea intended to help connect sources of information and make sense of them. But how do you actually access the Semantic Web?

If the Semantic Web turns out to be anything like the Web we all know and use today, then the gateway to the Semantic Web will be the trusty Web browser.

At present, Mozilla may be leading the major browser vendors in bringing semantics to everyday Web browsing, courtesy of tools built into its upcoming Firefox 3 Web browser. Microsoft’s Internet Explorer 8 (IE8) may not be far behind, however.

Both browsers are working to include some measure of support for microformats — a simple means of categorizing Web content as metadata.

“Firefox 3’s microformats API and support for detecting different types of content inside of RSS feeds are both important steps in the direction of creating a Semantic Web browser,” said Alex Faaborg, Mozilla’s user experience designer.

Microformats, defined by the technology’s community site as “small bits of HTML that represent things like people, events, tags, etc. in Web pages,” represent a lightweight means of bringing semantics to the Web.
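To make “small bits of HTML” concrete, here is a rough sketch that pulls a contact card out of a page marked up with the hCard microformat’s `vcard`, `fn` and `org` class names. It uses only Python’s standard `html.parser` module and is not the Firefox 3 API or the Operator extension; the sample markup and the `HCardParser` class are invented for illustration and handle only the simplest case.

```python
from html.parser import HTMLParser

# Hypothetical page fragment using hCard class names.
SAMPLE = """
<div class="vcard">
  <span class="fn">Jane Example</span>,
  <span class="org">Example Corp</span>
</div>
"""


class HCardParser(HTMLParser):
    """Collects the text of `fn` and `org` elements inside `vcard` blocks."""

    FIELDS = {"fn", "org"}

    def __init__(self):
        super().__init__()
        self.cards = []      # one dict per vcard found
        self._field = None   # field whose text is expected next

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "vcard" in classes:
            self.cards.append({})
        for name in classes:
            if name in self.FIELDS and self.cards:
                self._field = name

    def handle_data(self, data):
        if self._field and data.strip():
            self.cards[-1][self._field] = data.strip()
            self._field = None

    def handle_endtag(self, tag):
        self._field = None


parser = HCardParser()
parser.feed(SAMPLE)
cards = parser.cards
print(cards)   # [{'fn': 'Jane Example', 'org': 'Example Corp'}]
```

The page still renders as ordinary HTML for human readers; the class names are the only addition, which is what keeps the approach lightweight.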

Last year, Faaborg said that microformats can be thought of as the “lowercase Semantic Web,” since they are less expressive but less complex than Resource Description Framework, also known as RDF, or Web Ontology Language, also called OWL. RDF and OWL are techniques for describing and organizing Semantic Web information.

In Firefox 2, microformats had been enabled by way of the Operator extension, which was developed by IBM’s Michael Kaply. With the upcoming release of Firefox 3, microformats are more tightly integrated with the core browser, by way of an API for accessing microformatted content on a Web page.

Mozilla is also pushing the development of microformat-enabled content with no less than eight articles in the Mozilla Developer Center documenting microformat support in Firefox 3.

“I worked with Michael Kaply on the initial versions of his popular ‘Operator’ add-on, which detects and displays microformatted content in pages, and since then, he’s really built on top of it and turned it into a fantastic and useful Firefox add-on,” Faaborg said.

“Given the vibrant extension development community that Firefox has, we expect to see a variety of innovative extensions making use of this API over the lifecycle of Firefox 3 and beyond,” he added.

Faaborg also said Kaply developed add-ons that use Firefox 3’s microformats API to detect and display microformat technologies likely to be popularized through IE8. Microsoft’s browser makes use of microformats by way of two technologies, WebSlices and Activities, that enable site developers to more easily pull in third-party content.

Though IE8 does have some semantic capabilities, it’s unclear whether Microsoft currently considers IE a “Semantic Web browser.”

“The Internet Explorer team is serious about enabling Web developers to be the most effective and efficient as possible,” a Microsoft spokesperson said in an e-mail.

However, the spokesperson was unable to comment further on the company’s current activities or plans.

“Microsoft does not have anything to share at this time regarding Semantic Web standards,” the spokesperson said.

Read more

Berners-Lee Talks Up Linked Open Data Movement

Erin Joyce Contributor

NEW YORK — Data isn’t worth much until it’s free — freed from the silo it’s locked up in, and used in a mashup that creates valuable new resources for you and others. Freeing data is also behind a fast-growing movement around Linked Open Data — or what many call Web 3.0 for short, said the founder of the World Wide Web.

During a keynote address at the Linked Data Planet conference here, Sir Tim Berners-Lee stumped for the next vision of the Web – dubbed Web 3.0 — and the linked open data movement that is behind the forming Semantic Web.

“Linked open data is a movement,” he said. “It’s a movement that has taken off internationally; it’s a grass roots movement, and it’s about information that is free to use in the Linked Data format.”

This doesn’t mean all data should and will be free — you decide what’s open and in the public realm and what stays behind a firewall, he stressed. But the decision not to trade data should be because you don’t want to, and not because your data just doesn’t understand the other party’s. That’s the fundamental part of the Linked Open Data movement he discussed with attendees at the conference, which was sponsored by Jupitermedia, the parent company of this site.

Web 3.0, Semantic Web — even Linked Data, is “about simple ideas that make the Web work and using them for data. But it’s about getting one format across applications so the Semantic Web standards enable me looking at my bank statement. Now I can drop that into my calendar and do something with it,” he added.

The Web as it is currently architected can’t do that now. “We all want to do stuff with data. Let’s get it on the Web and do stuff with it, and have one standard for doing that. Linked data is a very simple set of rules of putting [this data] on the Web,” he continued.

The Semantic Web in action

If you’re selling printers, for example, customers should be able to look up the Uniform Resource Identifier (URI) and pull up data about that printer. But that’s just a small part of the Semantic Web.

Back when the Web first came into wide use, people were amazed that they could just go to a store online, he noted. With Web 3.0, the shopping can be done for you, based on specific parameters you assign your data, as well as boundaries on how it can be used, and by whom.

So as you work on getting data tagged, and when working with Semantic Web Ontologies, deploy open standards and hew to the existing best practices when creating ontologies, he added.

He urged attendees to look over their data, take inventory of it, and decide which parts they would most likely benefit from reusing on the Web. Decide on priorities and on the benefits of that data reuse, and look for existing ontologies on the Web for guidance on how to use it, he continued, referring to the term that describes a common lexicon for describing and tagging data.

More than anything, don’t change the way you’ve worked with the data, and work on hewing to open standards. After all, “If you’re not going to give your data to me, let it be because you decided to, not because you can’t.”

This article first appeared on

Where is the Semantic Web Killer App? (Part 2)

Dan Grigorovici Contributor

(Editor’s note: This is Part 2 of Dan’s column, “Where is the Semantic Web Killer App?” In Part 1, he analyzed the Semantic Web’s credibility problem, and he urged the semweb community to answer some questions, including: Find out the business problem — and also the consumer problem — the Semantic Web is solving. We pick up his top 5 issues at No. 3.)

3. Build your strategy and business team for your Semantic Web startup

No matter how strong your technology is, we cannot solve the problems I discussed yesterday, which are plaguing the discourse in consumer-space Semantic Web startups today, without strategy and business.

We need to let business people drive the Semantic ship, as they are in the best position to answer the simple questions I mentioned yesterday. In my experience — as I am sure in any corporate technology project anyone has experienced — successful technology products have been successful precisely because they were started with the end goal in mind (business preparing the requirements after identifying the problem in need of a solution), rather than the other way around. Having a healthy dose of business reading (Jim Collins’ “Built to Last” comes to mind) will help in realizing that a visionary technology is nothing without being embedded with business vision.

It seems to me that the only reason our community is so keen on evangelizing the Semantic Web to everyone willing to listen (other than the fact that we’ve spent at least the last seven years working on it) is, or at least should be, that we believe in its benefits. But again: its benefits have nothing to do with the technology itself; they are about making the use of information better or easier for consumers and businesses. Interesting fact, though: whenever I ask a Semantic Web preacher, “Would you support a different technology that solves the same problem faster, with the same effectiveness, etc.?” I tend to get answers such as “we need Semantic Web.” To me this sounds alarmingly like technological totalitarianism, probably because of our need to validate the seven-plus years we’ve spent trying to deliver the Semantic Web vision “since 2001.”

So my question is: How do we address consumers and users when the question of which particular technology delivers the end result is not really important to them, as long as their needs are met? I suspect that only our business team will be able to show us the path to solving this.

4. Don’t prove the obvious; don’t fix what’s working

A few people I talked to in the industry mentioned there is no value in improving what is already working. In other words, if the output of our Semantic Web projects will be a much better organized, really inter-connected, data web — essentially delivering similar information to what is being delivered today by non-semantic applications — we have spent our time trying to (im)prove the obvious, or fix what was already working.

The promise of the Semantic Web is not so much that it delivers more relevant information to us, or that it opens up the Web to being queried like a database, especially if the results are essentially what a non-semantic search engine, or our incessant love for browsing loads of pages (which, by the way, has the advantage of making money for businesses today), could already deliver. It should be that it opens up NEW information that was not available or accessible before; that it’s “smarter,” so that I no longer spend hours and still fail to find out “the relationship between the price of tea in China and the Kansas hurricane of 1978,” or “show me the list of pop singers whose last name is Johnson who are NOT members of band x.” If all we are improving on while building the Semantic Web killer apps is the “guts” of the Internet, which will essentially deliver similar, but only slightly better, information, we need not bother spending time on these. Instead, we should prioritize our development (both technical and business) by focusing on delivering what we could not have delivered before without it. I say this because a lot of Semantic startups are really delivering the same information as before, only more effectively (while costlier, due to the schema “complexity tax,” Nick Lothian might argue). Let us take a second to study what “more effectively” means here.

Read more

Jump In, the Water’s Fine

Jennifer Zaino Contributor

NEW YORK — It’s a blue ocean out there, with tons of potential. That’s what Ian Davis, CTO of semantic web platform provider Talis, told an audience gathered today at the LinkedData Planet Conference here.

Learn how the Semantic Web is changing the way we treat data at the LinkedData Planet Conference. Sir Tim Berners-Lee, inventor of the World Wide Web and director of the W3C, is among the event’s keynote speakers.

The semantic web, with the possibilities it offers for decreasing the costs of integrating, using and publishing data, while increasing the value derived from publishing it, has big implications for companies, Davis said. It means that there is a wide vista of potential opportunities that can radically change the way enterprises do business and people interact on a social basis.

“There are more ways to link data, to use information and share it among and in a single organization,” said Davis. “What that tends to mean is you get a blue ocean, a metaphor that means the market doesn’t exist yet.”

Not a bad sea to swim in, in contrast to the red ocean, which Davis said is the province of most established businesses today.

“They’re all competing fiercely, the rules of the game known … it’s where the fighting happens. In the blue ocean, we don’t know what will happen. We don’t know how to price things, or what the applications will be, but there is enough room to try things out. Right now we are at the start of a new market.”

His advice to enterprises — many of them often slow to move forward on a trend that asks them to open up their data so that they can exploit these new opportunities — is to start figuring out how they might play in this new space. Because just a decade from now, the new space will be THE space.

“Ten years from now we won’t be thinking of the semantic web, it will just be the web. But a smarter web that has more information,” he said. Think, for example, that just 10 years ago organizations were still arguing about why they even needed a web site, and now no one gives a second thought to this as a means for sharing information and letting more people interact with you.

“So my advice to you here is we’ve got to take the first steps, understand how to go about publishing this data.” A good start is working at how you can innovate around sharing some of the data internally between silos before opening it more broadly.

The open network, Davis said, has always outpaced the closed one (see the old AOL and CompuServe model vs. the open web), because of its ability to continually generate more value as it attracts more participants.

Read more

Keep it Simple, Stupid

Jennifer Zaino Contributor

NEW YORK — Web 2.0 is not the enemy. Many of Web 2.0’s cornerstones are exactly what the semantic web needs, Uche Ogbuji, a partner at Zepheira, told an audience today at the LinkedData Planet Conference here.

Web 2.0’s underlying principles are simple for webmasters to grasp, as are its end results of helping them boost their search rankings. At the same time, Web 2.0 has opened some doors to the dark side (spam, malware) that can be addressed by semantic web concepts.

“Is this a catastrophic distraction from what the web needs? Maybe not,” Ogbuji said. “Web 2.0 is about thinking globally, and acting locally, like the hippies used to say.”

That is, you just worry about making your web site connect to others, and cool things will happen at a global level — mash-ups, feeds, user-contributed content, all fueling your Google rankings. “The point is a lot of people could appreciate that message, because it was simple, well-compartmentalized, and understandable,” Ogbuji said. “Follow these suggestions and your web site becomes part of something bigger, better, so much better connected to Google — and that’s what everyone really cares about, right?”

Proponents of the semantic web should dispense with attempting to market it as Web 3.0 and with trying to go too big in concept for the average Webmaster to readily grasp.

“We’re not loud enough, for one thing,” he says of the community. But more importantly, we won’t get to a better web unless you can break things down so that people can figure out what you mean and what they’re supposed to do in ten minutes. The message should be about encouraging webmasters to do a few neat things that are less sloppy on their websites, and “all sorts of global goodness will come from it. Think globally, act locally and we have a better web,” he said.

For example, he says, people don’t have to become OWL (Web Ontology Language) experts overnight — it’s good enough to take the baby steps of using Atom 1.0 to help capture some semantics that HTML doesn’t, such as who created a web page.

“I say we’re not different from Web 2.0, we’re just Web 2.0 done right,” Ogbuji said, in being vendor-independent, scalable, and multi-device friendly, for example, as well as making it easier for data to be indexed and more likely that search engines will find it. Also, as spammers, malware writers, and other thugs have caught up with Web 2.0, it may be time to bring back the semantic web’s idea of the web of trust.

Toward these ends, Ogbuji recommends building on Sir Tim Berners-Lee’s basics of what makes a better web: use URIs (Uniform Resource Identifier) that stand for something, use HTTP URIs so people can look up those names, provide useful information when people look up those URIs, and include links to other URIs so people can discover more things. “This is stuff that people can wrap their heads around,” he says. “Joe Webmaster can understand this, and then those people who work on tools and techniques can use those openings to give people more clever things they can do to make the web better.”
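As a rough illustration of how small these four basics are, the sketch below checks a resource description against them without touching the network. It is an assumption-laden toy rather than a validator: the `check_linked_data` function, the rule names and the `example.org` URIs are all hypothetical, and “useful information” is reduced to “the description is not empty.”

```python
from urllib.parse import urlparse


def is_http_uri(value):
    """True if `value` is a string that looks like an HTTP(S) URI."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_linked_data(uri, description):
    """Report which of the four basics a resource description satisfies.

    `description` maps property names to values; values that are HTTP URIs
    are treated as links to other things.
    """
    return {
        "uses_uri_as_name": bool(urlparse(uri).scheme),
        "uses_http_uri": is_http_uri(uri),
        "provides_information": bool(description),
        "links_to_other_uris": any(is_http_uri(v) for v in description.values()),
    }


# Hypothetical resource: a printer described by its vendor.
printer = "http://example.org/products/printer-42"
printer_data = {
    "label": "Example laser printer",
    "madeBy": "http://example.org/companies/example-corp",
}
good = check_linked_data(printer, printer_data)
bad = check_linked_data("urn:example:printer-42", {})
print(good)
print(bad)
```

The first resource passes all four checks. The second has a name that is a URI but cannot be looked up over HTTP, says nothing, and links nowhere, so only the first check passes.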

Linking open data is the real Web 2.0, Ogbuji contends, and getting there will require building and refining the basics outlined by Berners-Lee, which is the project of the Linking Open Data (LOD) community initiative. It also helps to have lots of example sites that help people understand how they can get involved in linking data, with DBpedia probably being the best known of these.

“By doing that, you build this linked data basis, and you make it bigger until hopefully it becomes enough of a draw that everybody tries to join in that effort to make the web better.”

Where is the Semantic Web Killer App? (Part 1)

Dan Grigorovici Contributor

NEW YORK — Recently, I talked to a lot of key VC principals, and they confirmed what I have been suspecting for a while: the Semantic Web (or “Linked Data,” “dataweb,” etc.) has a credibility problem, to the point where they suggest avoiding keywords such as “artificial intelligence” and “semantic” in executive summaries, for fear of being met with disbelief (“the Semantic Web is the future of the web and it will always be,” proudly since 2001).

Coming from me — a deep believer in a “smarter” web as the future of the Internet and having worked with, and working on a related startup — this is a pretty big deal, but I think we need the community to focus on the business problem and face the issues in order to solve them. I have thus decided to make this the first topic of what will become a series of posts, because I think the SemWeb community needs to face the challenges coming from the business crowd and address them. I am hoping this will spur a serious debate and work on solving them.

I am a “data geek,” a Semantic Web entrepreneur, technologist, and believer. But I don’t think solving the Semantic Web credibility issue and designing its “killer app” and consumer success has anything to do with technology. I argue here that it’s the lack of business focus and basic ability to answer some simple questions (which non-semantic apps can) that are at the core of the continued lack of realization of even the smallest sign of a killer Semantic Web app.

There are some great startups (Twine, Adaptive Blue, Freebase, Zitgist, and others) that have risen in recent years, surely; there are some technological reasons why adoption has been slow, surely. But at its core, I don’t think solving the problem and delivering the vision of a “smarter” web has much to do with web site owners’ laziness in adopting a standard, or the scalability of triple stores, or a lack of technical expertise on the part of consumers, clients, funders, etc.

The point is: After more than seven years of promising the Semantic Web deliverance, we still can’t get our one-pagers clear. We still can’t explain to users, funders, or anyone else outside the community what we are building and how it is better than the “dumb” (but increasingly crowd-telligent) and newspaper-ish web of today. That is, we can’t do any of the above without diving into a long digression through the meanders of technological detail, or without advocating the need to build vertical knowledge bases from the ground up. At which point we lose both the user and the funder. Not to mention the political factions that exist in the community today between the practical and the purists (more about this in my next post).

One of the best and most succinct presentations of some of the issues comes from Nick Lothian, commenting on Peter Norvig, author of one of the best AI textbooks and director of research at Google. The other commentary I have come across, much more focused on technology issues, is, “The 7 (f)laws of the Semantic Web.”

Let me explain what the issue is by linking my personal experience and thinking, in detail. I believe the Semantic Web community sounds a lot like a solution in search of a problem. It is only natural to be so, since the work has been done mostly by technologists, but this has been hampering the ability to generate the “killer app.” How so, you will ask, in disbelief? Here are my answers:

“Where’s the (business) beef?”

From the point of view of technology, we are almost there: having arisen from the academic community and to a large extent (except implementations in corporate projects) limited to (still) academic projects, Semantic Web projects don’t lack a host of technological implementation choices. What we do lack is a consistent business team in every one of these projects.

Read more