Archives: March 2008

Introducing the Semantic Web Gang Podcast

Paul Miller Contributor

Gang members for the first show are:

  • Greg Boutin
  • Mills Davis of Project 10X
  • Tom Heath of Talis
  • Alex Iskold of AdaptiveBlue
  • Daniel Lewis of OpenLink
  • Thomas Tague of Reuters

    We shall be adding to the gang in the coming months, as well as introducing the occasional special guest. Check back for next month’s podcast.

    Listen Now

    This conversation was conducted on March 20. For further Talking with Talis podcasts on the emerging Web of Data, click here.

    At Talis, Paul Miller is active in raising awareness of new trends and possibilities arising from wider adoption of the Semantic Web.

    My Semantic Manifesto

    I've been talking with my son this week about the importance of words and what they mean. He is studying the Constitution, the Bill of Rights and the Articles of Confederation in his 5th grade history class – he was so inspired by reading those documents that he asked me to take him to the Library of Congress on our trip to Virginia next week. I was suitably impressed and was also reminded of the relationship between concepts and constructs.

    Read more

    To Market, To Market, The Semantic Way

    Jennifer Zaino Contributor

    The arena that today’s chief marketing officer inhabits is very different from what it was 10 years ago, and semantic web technologies are only going to change things even more.

    From enterprise marketing management tools to search engine optimization, marketers have had to become conversant with the technologies that will enable them to achieve their branding, campaign, or lead-generation objectives, though they needn’t actually aim at becoming technologists themselves.

    “The channels in which they are operating tend to favor the advantage to those who understand the technology,” says Scott Brinker, the president and CTO of Ion Interactive, a company that delivers post-click marketing software and services to test and analyze web site visitors’ landing experiences. Brinker writes the blog Chief Marketing Technologist. “So if you understand how to do really good search engine optimization, your company ends up with higher placement, or more traffic, or more customers.”

    Yahoo’s recent announcement that it is embracing semantic web standards is now providing marketers with a clear and immediate incentive to leverage semantic web technologies in their web properties and gain an advantage over competitors — especially if other leaders in the search world follow suit.

    “Search engines, in my opinion, have been the primary driver of Internet marketing taking off to the degree it has,” says Brinker. “Whole businesses and agencies are focused on that aspect of helping companies represent themselves in search engine results. But what they are doing today is uni-dimensional or two-dimensional at best.”

    They’ve been very creative about exploiting the very narrow band of things they can influence with the 95 characters of text they typically get for the headline, blurb, and URL that show up in search engine results pages, but “the Yahoo announcement, to me, represents a chance to add several dimensional layers on top of that,” Brinker says.

    In a world in which skeptical consumers closely suss out search results to determine whether a site really meets their specific interests, or whether they just happened to get caught in someone’s wide net, the companies that start exposing semantic data that is clearly relevant and targeted to a particular searcher will gain the advantage, Brinker says.

    “Even if you assume that initially [adding semantic metadata] doesn’t impact where [something] appears in the results, the reality is that in the results data that comes across, one that represents quality semantic information is going to be more compelling than that sort of random mash-up of text that usually appears” as a link synopsis, he says. Plus, as more people visit and link back to that site, it creates more inbound links and naturally starts to raise the page’s position in the search engines, he says.

    In his blog this week, Brinker discusses these issues, even coining a next-wave term for search optimization — SEO++, in a nod to the past paradigm shift from the C programming language to C++. And make no mistake, he says, it is a paradigm shift. At the Search Engine Strategies show in New York last week, you could have heard a pin drop when Yahoo! Search’s chief scientist Andrew Tomkins gave a keynote that described its plans for the Open Search platform.

    Read more

    LinkedIn’s Semantic Technology Initiatives

    I recently spoke with Steve Ganz, LinkedIn’s Principal Web Developer, about the role of semantic technologies in LinkedIn’s strategy.

    SR: Steve, I’ve seen some press releases recently about how LinkedIn is using semantic technologies as part of its strategy. When did this start?

    Ganz: LinkedIn has been using some aspects of semantic technology since 2006. Mainly we use a variety of microformats in our web presence. In fact, LinkedIn is the largest publisher of the hResume format.

    SR: Do you distinguish between front end and back end uses of semantics?

    Ganz: Most of what we use is what I categorize as “lower-case” semantic web, which is different from the “upper-case” Semantic Web technologies. Our uses are almost entirely based on microformats in the front end. We use them to give users access to their own data in ways beyond simply viewing it on the website. I like to think of microformats as the “format for the people.” They are easy to implement: you apply meaningful class names to well-structured, semantic HTML that already marks up existing, published, human-readable data.
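    The idea Ganz describes can be seen in a minimal sketch: an hCard is ordinary HTML whose class names ("vcard", "fn", "title", "org") carry the meaning, so a small parser can lift structured data straight out of the visible page. The profile fragment below is hypothetical, not actual LinkedIn markup.

```python
# Sketch of the microformat idea: class names on ordinary HTML carry meaning.
# The profile fragment is a made-up example, not real LinkedIn markup.
from html.parser import HTMLParser

HCARD = """
<div class="vcard">
  <span class="fn">Jane Example</span>,
  <span class="title">Web Developer</span> at
  <span class="org">Example Corp</span>
</div>
"""

class HCardParser(HTMLParser):
    """Collect the text of the hCard properties we care about."""
    PROPS = {"fn", "title", "org"}

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        hits = self.PROPS.intersection(classes)
        if hits:
            self._current = hits.pop()  # remember which property this tag opens

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] = data.strip()
            self._current = None

parser = HCardParser()
parser.feed(HCARD)
print(parser.fields)
# {'fn': 'Jane Example', 'title': 'Web Developer', 'org': 'Example Corp'}
```

    Because the markup is just HTML, the same characters render for people and parse for machines — which is the whole appeal of the "lower-case" approach.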

    SR: When I was on my LinkedIn account the other day, I noticed a section titled “People you may know.” As I was thinking about semantic technologies, I made a guess that there was some kind of semantic analysis going on in the background to enable this. Is that the case?

    Ganz: We don’t actually use any semantic structures in our data that I’m aware of. The “People you may know” section is created by using sophisticated algorithms on traditional data.

    Our concentration is on making the user’s data easily available and useful. To that end, we use a variety of microformats, including hCard, hResume, hEvent, and hReview. We are also implementing the XFN (XHTML Friends Network) microformat, which is one of the cornerstones of the Google Social Graph API.

    This works by applying the “rel” attribute with a value of “me” to links to your other websites, then the Google spiders go out and discover all of your related sites. Our public profile is a good starting point for that as we move toward exploring OpenID and OAuth for digital identities.
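    The rel="me" mechanism Ganz describes is simple enough to sketch: any link carrying that value asserts that the linked URL identifies the same person, and a crawler just collects those links. The profile HTML below is a hypothetical example.

```python
# Sketch of XFN identity consolidation: links with rel="me" claim that the
# target URL belongs to the same person, which crawlers like the Google
# Social Graph API follow outward. The profile markup here is invented.
from html.parser import HTMLParser

PROFILE = """
<ul>
  <li><a rel="me" href="http://example.com/blog">My blog</a></li>
  <li><a href="http://example.com/other">Unrelated link</a></li>
  <li><a rel="me" href="http://photos.example.com/jane">My photos</a></li>
</ul>
"""

class RelMeParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.identities = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "a" and "me" in a.get("rel", "").split():
            self.identities.append(a["href"])

p = RelMeParser()
p.feed(PROFILE)
print(p.identities)
# ['http://example.com/blog', 'http://photos.example.com/jane']
```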

    SR: How do you see the adoption of microformats in general?

    Ganz: Microformats are gaining in adoption from the publishing standpoint. It’s practically ubiquitous compared to some of the upper-case semantic technologies. Yahoo! is a huge publisher of microformats, and Google now publishes geo locations in microformats. It’s definitely gaining popularity.
    I think this is because using microformats doesn’t require building any back end infrastructure. Rather, it’s easy to implement for the front end web developer, and they will naturally gravitate to it.

    SR: Do you see what you refer to as the upper-case and lower-case semantic communities coming together?

    Ganz: There are definitely benefits to having the two communities work more closely. It hasn’t always been that way, but I sense a bit more of a cooperative spirit now.

    SR: What’s next as far as LinkedIn’s use of semantics?

    Ganz: We will hopefully be experimenting with RDF and FOAF in the coming months. As the technology becomes fully realized, I’m sure that the Semantic Web will be every bit as profound a development as the WWW was.

    W3C Springs Into Spring

    Jennifer Zaino Contributor

    Last week was a busy one on the semantic web front for the World Wide Web Consortium (W3C).

    The Protocol for Web Description Resources (POWDER) Working Group published an updated Working Draft of Protocol for Web Description Resources (POWDER): Description Resources. The working group was chartered about a year ago to produce recommendations covering how groups of resources can be described and how the origin of those descriptions can be authenticated. Such resources can be described through the publication of machine-readable metadata, encapsulated by Description Resources (DRs).

    According to the W3C, the latest draft describes how DRs can be created and published, how to link to DRs from other online resources, and how DRs may be authenticated and trusted. The aim is to provide a platform through which opinions, claims and assertions about online resources can be expressed by people and exchanged by machines, the W3C says.

    There are some interesting motivators driving the development of POWDER. These use cases include:

  • Having an optimal web experience on a mobile phone, by showing search results in which the metadata associated with the links presented indicate conformance with mobileOK, a trustmark that can be applied to online content that meets criteria derived from the Mobile Web Best Practices;

  • Improving web functionality — for example, redirecting users on slow connections to a page of video clips that are more appropriate to stream at a lower bandwidth;

  • Improving the web experience for people with disabilities, via web search engines that can gather metadata about sites including their conformance to the Web Content Accessibility Guidelines, so that search results can be prioritized according to compliance with particular checkpoints of those guidelines; and even

  • Child protection. A network operator could, for example, have access to a metadata description from a web portal that indicates a particular site is associated with adult nudity. Should a child be sent a link to this site — and his parents have activated a feature from the network operator that lets them restrict the kind of content their children can view — the network operator will be able to block the child’s access to the site.

    Also last week, the Semantic Web Deployment Working Group and the XHTML2 Working Group jointly published an updated Working Draft of the RDFa Primer 1.0. RDFa lets XHTML authors express the structured data in their web pages — calendar events, contact information, and so on — using existing XHTML attributes and a handful of new ones, paving the way for users to transfer structured data between applications and web sites. The rendered, hypertext data of XHTML is reused by the RDFa markup, so that publishers don’t need to repeat significant data in the document content, the working group says.
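    A rough sketch of the RDFa idea: a few attributes (here, "property" and "content") annotate the human-readable text itself, so the same characters serve both the rendered page and the extracted data. The event markup below is a hypothetical illustration, not an example from the RDFa Primer.

```python
# Toy extraction of RDFa-style annotations: a "property" attribute names a
# predicate; the value is either an explicit "content" attribute or the
# rendered text inside the tag. Markup and vocabulary URI are illustrative.
from html.parser import HTMLParser

EVENT = """
<div xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
  <span property="cal:summary">Semantic Web Gang podcast</span> on
  <span property="cal:dtstart" content="2008-03-20">March 20</span>.
</div>
"""

class RDFaSketch(HTMLParser):
    def __init__(self):
        super().__init__()
        self.statements = {}
        self._prop = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "property" in a:
            if "content" in a:
                # explicit machine-readable literal overrides the display text
                self.statements[a["property"]] = a["content"]
            else:
                # otherwise reuse the rendered text as the value
                self._prop = a["property"]

    def handle_data(self, data):
        if self._prop:
            self.statements[self._prop] = data.strip()
            self._prop = None

p = RDFaSketch()
p.feed(EVENT)
print(p.statements)
# {'cal:summary': 'Semantic Web Gang podcast', 'cal:dtstart': '2008-03-20'}
```

    Note how the dtstart value comes from the content attribute while the summary is recovered from the visible text — the "don’t repeat significant data" point the working group makes.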

    Earlier this month, the W3C had some additional news on the RDF front. It announced the W3C RDB2RDF Incubator Group, sponsored by W3C Members Oracle, HP, PartnersHealthcare, and OpenLink Software. The group’s goals, it says, are to examine and classify existing approaches to mapping relational data into RDF, assessing whether standardization is possible and/or necessary in this area, and to examine and classify existing approaches to mapping OWL classes to relational data (or, more accurately, SQL queries), moving toward the goal of defining a standard in this area.
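    One common shape of the relational-to-RDF mappings the incubator group is chartered to survey can be sketched simply: each row becomes a subject URI, each column a predicate. The table name, base URI, and data below are all hypothetical, not drawn from any of the group's deliverables.

```python
# Rough sketch of a table-to-triples mapping: row -> subject URI,
# column -> predicate, cell -> literal. Base URI and data are invented.
BASE = "http://example.org/db/"

def row_to_triples(table, pk_col, row):
    """Map one relational row (a dict of column -> value) to RDF triples."""
    subject = f"<{BASE}{table}/{row[pk_col]}>"
    return [
        (subject, f"<{BASE}{table}#{col}>", f'"{val}"')
        for col, val in row.items()
        if col != pk_col
    ]

row = {"id": 7, "name": "Jane Example", "city": "Madrid"}
for s, pr, o in row_to_triples("person", "id", row):
    print(f"{s} {pr} {o} .")
# <http://example.org/db/person/7> <http://example.org/db/person#name> "Jane Example" .
# <http://example.org/db/person/7> <http://example.org/db/person#city> "Madrid" .
```

    Real proposals differ in exactly how subjects are minted and how foreign keys become links between resources, which is the design space the group wants to classify.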

    Got a live one

    On my hunt to find real-life implementors of semantic technologies,
    preferably ones who have finished projects that impact lots of
    non-computer people, I’ve now had an extensive interview with a group
    that fits that bill.

    Read more

    Knowledge Management: From the Personal to the Enterprise

    Jennifer Zaino Contributor

    Cognitive psychologist Lars Ludwig has developed an application
    dubbed ArtificialMemory as a test bed for his interests in experimenting with
    knowledge management techniques and approaches.

    Currently in use with a
    couple of companies, the application may be the semantic equivalent of
    online analytical processing (OLAP), the simplified approach taken by the
    business intelligence community to quickly get answers to multi-dimensional
    questions. The system aims at creating a very easy way to access and aggregate semantic information — doing what an OWL engineer would call inferencing — so that individual and group knowledge can be set free from documents and the applications that produce them, in order to be reused as needed. ArtificialMemory supports semantic web standards such as RDF and OWL, while taking a very different approach to the idea of inferencing.

    “In order to get rid of documents, you have to break the content into pieces and rearrange it in a useful way,” says Ludwig. “For that you need a kind of semantic framework, and that’s the motivation behind ArtificialMemory, to provide this semantic framework in a fashion that lets you really work fast. You don’t have to think about a data model, really. What you do is kind of either instantiate objects that you already know or just describe what you are noting down.”

    Users of ArtificialMemory explicate relationships — something is a book or a person, for example — to create a very simple form of classification that lets knowledge easily be reused later, vs. what he says is the more complex semantic web approach of using OWL ontologies to infer knowledge based on what an instance is and to which classification it belongs.
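    The contrast can be made concrete with a toy sketch: in the explicated approach, the user states the type of a thing directly as just another relationship, rather than having an ontology infer it. All names below are illustrative, not ArtificialMemory's actual API.

```python
# Toy model of user-explicated relationships: typing ("X is a Book") is just
# another statement the user writes down, so no inference engine is needed.
# Function and relation names are invented for this sketch.
explicit_store = []

def note(subject, relation, obj):
    """Record a relationship the user explicates directly."""
    explicit_store.append((subject, relation, obj))

note("Moby-Dick", "is_a", "Book")
note("Herman Melville", "is_a", "Person")
note("Moby-Dick", "written_by", "Herman Melville")

def instances_of(kind):
    """Reuse the knowledge later by a simple scan, not OWL inference."""
    return [s for s, r, o in explicit_store if r == "is_a" and o == kind]

print(instances_of("Book"))
# ['Moby-Dick']
```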

    “The problem is that the semantic applications, as far as I understand them and see them, were originally a movement to make the web as we know it machine-readable,” he says. “The problem is that, actually, a fact being extracted from the web or documents–facts and annotations are not really readable to humans. So there is missing an important part here and that is readability for humans.” RDF, XML, and company are good for machines, “but in the end if you step away from automatic fact generation you are running into a problem. You have to get users to annotate facts, statements, and this is something that has not really been thought through by the scientific community as far as I can see.”

    Read more

    Semantic Web at DAMA

    Wilshire Conferences is putting on the DAMA International Symposium and
    MetaData Conference this week here in San Diego. A lot of the issues
    that data management folks care about (master data management, data
    federation, identity management) are also dealt with by the Semantic
    Web community.

    Read more

    Summering on the Semantic Web

    Jennifer Zaino Contributor

    In July, the European Summer School on Ontological Engineering and the Semantic Web will host its sixth annual session. Last week, registration opened for those who would like to be among the 50 post-graduate students that participate annually — and who likely will become part of a cohort of the next generation of researchers who will push work in the area forward.

    The school is sponsored by a number of European projects, including LUISA, NEON, SUPER and X-MEDIA, as well as by STI International, and its tutors and keynote speakers include a veritable who’s who of semantic web thought leaders. Tutors include John Domingue of The Open University in the UK, who is spearheading the organizing committee, and Sean Bechhofer of the U.K.’s University of Manchester; among the keynote speakers are Rensselaer Polytechnic Institute’s Jim Hendler and Mark Greaves of Vulcan Inc.

    The week-long session takes place in Cercedilla, a small village in the mountains near Madrid. The approach is based on tutorials and hands-on practical workshops around developing Ontologies and Semantic Web applications, all linked to a mini-project that results from cooperation among the participants. Bechhofer, who is now involved with the project for the fourth year, says the kind of collaboration and team-work that is fostered in the intimate and intense program matters to the future of semantic technology.

    “We see the groups that got together in projects there, and as you go on to subsequent meetings, conferences or workshops, you see those students are still together, interacting, form collaborations, cooperating, visiting, and helping to build and weave together a community, which is very important,” he says. “It’s really a useful and nice thing to see the fact that the summer school helps to forge those links.”

    Publications and workshop papers have come out of collaborations that originally started at the summer school, he notes. This year, Bechhofer expects to give a theoretical session on the rationale behind OWL, including its importance to the development of the semantic web and its technical details (ontology matching, ontological engineering, conceptual design, and so on), which will be followed by a practical session exploring those ideas against some sample models.

    Other instructors have agreed about how the program is helping develop the next generation of PhD candidates.

    “During the last days the students have to work in teams on mini-projects to come up with interesting applications, bringing up research questions and to get a glimpse on how to work in international collaborative teams,” wrote Stephen Baumann of the Competence Center Computational Culture, German Research Center for AI, in a blog posting a couple of years ago. He presented at the fourth edition of the summer school on the topic “Towards a Social Web?!” He called the event a “must” for those PhD students in the early stages of their work, as well as their supervisors.

    Yahoo Takes to the Semantic Web

    Jennifer Zaino Contributor

    One of the giants of search, Yahoo Inc, has announced that it will be embracing a number of semantic web standards.

    Support for the standards will be built into the new Yahoo Search open platform, according to Amit Kumar’s entry in the Yahoo Search blog on Thursday. Kumar is the director of product management at Yahoo Search.

    “In the coming weeks, we’ll be releasing more detailed specifications that will describe our support of semantic web standards,” he writes.

    “Initially, we plan to support a number of microformats, including hCard, hCalendar, hReview, hAtom, and XFN. Yahoo Search will work with the web community to evolve the vocabulary framework for embedding structured data. For starters, we plan to support vocabulary components from Dublin Core, Creative Commons, FOAF, GeoRSS, MediaRSS, and others based on feedback. And, we will support RDFa and eRDF markup to embed these into existing HTML pages. Finally, we are announcing support for the OpenSearch specification, with extensions for structured queries to deep web data sources.”

    OpenSearch is a collection of simple formats for the sharing of search results that can be used to help people discover and use a search engine and to syndicate search results across the web.

    As part of the announcement, Yahoo says it will be opening up its new Yahoo Search platform to third party developers, with plans to launch a beta program for a development tool to build Enhanced Results applications for the platform in a few weeks. “Enhanced Results apps built by developers can utilize the structured data available through public APIs and in our index (made available by site owners through either feeds or the semantic web standards discussed above),” he writes.

    The news of Yahoo’s support for semantic web standards comes at the tail end of a week in which world wide web inventor Tim Berners-Lee said that Yahoo’s search rival Google could be superseded as the pre-eminent brand on the internet by a company that harnesses the power of next-generation web technology. According to an interview that Berners-Lee gave to Jonathan Richards of the U.K.’s Times Online, the search giant had developed an extremely effective way of searching for pages on the internet. But, he told Richards, that ability paled in comparison to what could be achieved on the web of the future, which will enable direct connectivity between much more low-level pieces of information and in turn give rise to new services.

    This development at Yahoo is separate from but related to its work in microsearch. Microsearch has been the testing ground for some of the ideas around semantic search, and most likely will continue to be, according to Peter Mika, a researcher who’s working in that area at Yahoo Research Barcelona. Microsearch is a research prototype, while SearchMonkey (the code name for Yahoo’s open search platform) is soon to be released as a product designed to handle a large number of users and a virtually unlimited amount of metadata, he writes in an email. There are a number of features in microsearch that won’t appear in SearchMonkey yet. Nevertheless, he says, the important news here is that Yahoo can now stand in front of the world and announce its support for Semantic Web standards in a product for which it has very high expectations.

    “Personally,” he writes, “I believe that SearchMonkey will unleash a completely new wave of innovation in the Semantic Web domain by putting the entire world’s metadata at the fingertips of developers. And that is quite something!”
