Archives: August 2008

Q & A with Semantic Guru Patrick Carmichael

Jennifer Zaino Contributor

What’s happening inside Semantic Technologies for the Enhancement of Case-Based Learning, the big U.K. project to enhance 21st-century learning we reported on earlier this summer? To find out, we recently caught up with Dr. Patrick Carmichael, project director and head of the Evaluation Group at CARET, the Centre for Applied Research in Educational Technologies at Cambridge University.

How did CARET come to be involved in this project?

Carmichael: CARET’s mission is to support teaching, learning, and researching across the university, but also to take on broader projects like this — externally funded research. Historically we have tended to do broadly educational technology projects concerned with developing virtual learning environments. But I also head up the unit concerned with developing teaching and learning with and without technology, and some of the projects have very minimal tech components — developing approaches to teaching more generally, for example.

So, one of the interesting things is that we have people who have an educational and social sciences background, and people who are software developers and content developers under the same roof. This new project is interesting because it really is specifically designed to be interdisciplinary, where the technologists and social scientists are working and learning with and from each other. That is central to the project.

And we’ve had a number of projects in which you can probably see the genesis of this one, such as digital repositories that have supported a lot of the big digitization projects at Cambridge in the university library and museums, for long-term digital preservation. One of the key applications we’ve identified there is the Fedora digital repository, which is very well set up to be adaptable for semantic web applications.

Also, we’ve had a number of teaching and learning projects to do with teaching complex, rapidly changing, or controversial issues — we have teaching staff very interested in how you prepare undergraduate and postgraduate students to deal with complexity through problem- and case-based learning. That’s where the project emerged for us, from a dual interest in developing robust and scalable architectures and doing really high-level and effective teaching around complex issues. The semantic web fits in as a way of supporting both our technology concerns and those teaching and learning issues.

What are the specific roles for your group?

Carmichael: There are two specific roles for our group. One is that this is where the actual technology development is going to take place. We will be drawing on our prior experience with digital repositories, linking with the MIT team [that developed SIMILE (Semantic Interoperability of Metadata and Information In unLike Environments)], and we also have a very close working relationship with the Economic and Social Data Service, which is the electronic digital repository for the social sciences in the U.K. They are the long-term guardians and brokers of social science data from research projects in the U.K. So Cambridge is at the center of the group of technology developers and service providers within the project.

Read more

The Here & Now of Semantic Integration

I’ve read some articles again this week explaining that the promise, or potential, of Semantic Integration is still a bit too far off to make a significant impact on the enterprise. Perhaps it is worth examining the subject more as a practice and approach than as a medium whose value depends on particular standards or technologies.

Read more

Semantics & Governance

Someone asked me recently whether or not there is some tangible relationship between Semantic Integration and governance – it was an excellent question, and the answer is a resounding "Yes."

Read more

Cognition’s Free Service Cuts Through Health Info Clutter

Jennifer Zaino Contributor

This summer, semantic web technology company Cognition launched a free service that uses its SemanticNLP natural language processing technology to help health care professionals, researchers, and consumers quickly discover complex health and life science material within the National Library of Medicine’s database of 18 million article abstracts.

“When it comes to SemanticMEDLINE, there are a number of cases where the terminology actually used in MEDLINE content is so ambiguous that if you were to type in a plain English phrase for what you’re looking for, you’d get not only perhaps the right information, but thousands of other results that have nothing to do with it,” says Cognition CEO Scott Jarus. “By using our technology you can more precisely hit the target and throw out irrelevant stuff.”

For example, a typical non-semantic query for “brain lobes affected by herpes encephalitis” is likely to return volumes of data about herpes and encephalitis generally, without reasoning about the specific class of terms in question. At the same time, the semantic technology behind the service ensures that individuals researching “heart attack” are also surfaced information on myocardial infarction — a phrase they may not have searched for because they didn’t know it was a synonym.
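The synonym-aware matching described above can be sketched in a few lines. The tiny synonym table and document list here are invented for illustration and have nothing to do with Cognition’s actual SemanticNLP lexicon:

```python
# Toy synonym-aware search: expand a query with known synonyms before matching.
SYNONYMS = {
    "heart attack": {"myocardial infarction"},
    "myocardial infarction": {"heart attack"},
}

DOCUMENTS = [
    "Outcomes after myocardial infarction in elderly patients",
    "Herpes encephalitis and the temporal lobes",
    "Dietary factors in heart attack risk",
]

def semantic_search(query, documents):
    """Return documents matching the query or any of its known synonyms."""
    terms = {query.lower()} | SYNONYMS.get(query.lower(), set())
    return [d for d in documents if any(t in d.lower() for t in terms)]

results = semantic_search("heart attack", DOCUMENTS)
# matches both the literal phrase and its synonym
```

A real system replaces the hand-built table with a large lexical ontology, but the principle — expand first, then match — is the same.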

Cognition developed the service as a showcase for its semantic natural language processing technology, some 20 years in the making. It’s the latest in a set of free semantic search services that also includes CaseLaw, a database of federal and Supreme Court decisions since 1950, and Wiki, for semantically searching the English Wikipedia. Cognition isn’t trying to be a Google killer, though. Its business model actually is to semantically enable other organizations’ technologies and applications.

“So, if you have content, and you want a semantic search capability, we can help,” says Jarus.

Customers include legal clients, who have applied its semantic natural language processing technologies to the ediscovery process. In fact, Cognition provides the advanced semantic search in LexisNexis Concordance. It also provides enterprise semantic search for large corporations in the biotech field, helping them organize their data semantically.

Read more

Shining a Light on Reuters’ Spotlight

Jennifer Zaino Contributor

For the last few months, the Thomson Reuters “Reuters Labs” initiative has been quietly moving forward with Spotlight, an open developer community that provides non-commercial developers with free access to Reuters multimedia articles, pictures, videos and text news via a set of API services.

Now, it is bringing the community out into the, well, spotlight, with news of some applications and mashups built on the experimental developer initiative, which integrates with the Open Calais web service to automatically create rich semantic metadata.

Basically, the service works like this: developers sign up and get an API key to access Reuters’ rich news content, or metadata about that content, via standards-based feeds. They can build content feed URLs or metadata feed URLs using a number of parameters that determine, among other things, the country from which the content is sourced and the format of the feed (Atom, RSS, Media RSS, JSON, Serialized PHP, etc.). Optionally, developers can request a semantically marked-up version of the content in RDF format, which uses the Open Calais service to automatically extract entities from the content and attach meaning and relationships to it in the form of metadata.
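Parameter-driven feed URLs of the kind described might be assembled along these lines. The base URL and parameter names below are hypothetical stand-ins, not the Spotlight API’s actual definitions:

```python
from urllib.parse import urlencode

def build_feed_url(base, channel, fmt="atom", country=None, semantic=False):
    """Assemble a content-feed URL from query parameters.

    The base URL and parameter names are invented for illustration;
    the real API defines its own scheme.
    """
    params = {"channel": channel, "format": fmt}
    if country:
        params["country"] = country
    if semantic:
        params["output"] = "rdf"  # ask for the Calais-enriched RDF variant
    return base + "?" + urlencode(params)

url = build_feed_url("http://example.com/spotlight/feeds", "topnews",
                     fmt="json", country="US", semantic=True)
```

The same builder covers plain syndication formats and the optional RDF variant with a single flag, which mirrors how the feed parameters are described above.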

Andrew Lister, head of labs development, says about 500 developers have signed up so far. The gallery of applications is still small — under ten — but he expects it to grow as developers learn that it’s a way to generate interest in their work. Among the existing applications is GIST, created by Thomson Reuters developer Todd Faulls using Spotlight and Calais. It offers a visual display of news filtered by people, places, and things that combines related stories, images, and video clips. The integration with Calais creates an understanding of popular items (for instance, a lot of Obama or Britney Spears tagging) and enables prioritization and navigation of related stories to give the reader a fuller view of a story and the trends around it.

Another service is a Spotlight/Daylife/Calais mashup that runs on a touch-screen device and explores new forms of presenting news content to help readers select and process news. It is predicated on the idea that a design system has to comprehend content to some degree in order to present it appropriately, and it uses natural language processing to gain insight into each news item’s specific content and structure.

“News is a fascinating piece of content. It changes all the time. People and places are always moving,” Lister says. “I think news is a really good piece of content to try things out with the semantic web.” Its very un-static nature challenges the ability to continually create, refine, and relate data sources, and build upon that intelligence.

He says he sees a lot of innovation happening with GIST-like applications. “If a lot of people have much the same ideas at the same time, then probably you’re going to end up with quite a good idea,” he says. “People are quite interested in what they can do with the semantic web and what extras it gives them.”

Spotlight, Lister emphasizes, isn’t just about putting news up on a blog site — there are widgets for that. “This,” he says, “is for people who want to build some ideas.”

Semantic Integration & Enterprise Architecture

In many ways, Enterprise Architecture (EA) is as misunderstood as Semantics. Although EA has been practiced across a much wider community of IT professionals for a longer period of time, it still suffers from an identity crisis. Is EA the mandatory precursor for model-driven development, or is it part of a bigger picture, and if so, what is that picture?

Read more

The Semantic Web’s Role in Dealing with Disasters

Jennifer Zaino Contributor

The University of Southern California Information Sciences Institute and Childrens Hospital Los Angeles have been working together to build a software tool. Dubbed PEDSS (Pediatric Emergency Decision Support System), the tool is designed to help medical service providers more effectively plan for, train for, and respond to serious incidents and disasters affecting children.

The project, a part of the Pediatric Disaster Resource and Training Center (PDRTC), has been going on for about eight months.

Dr. Tatyana Ryutov, a research scientist at the USC Information Sciences Institute, is working on the system. Recently, the Institute contacted Joshua Tauberer, who created and maintains a large RDF (Resource Description Framework) data set of U.S. Census data, about making SPARQL queries against that data in conjunction with the PEDSS.

“PEDSS helps hospital disaster-response personnel produce and maintain disaster response plans that apply best-practice pediatric recommendations to their particular local conditions and requirements,” Dr. Ryutov wrote in an email. Specific data on community demographics, geography, and neighboring industries — as well as on-site staff, facility size, and available supplies — could enable the software to tailor a plan to fit each user institution’s unique preparedness resources and vulnerabilities. According to the PDRTC web site, the software would collate the data; produce a list of the disaster scenarios most likely to impact the user; and outline the training, supplies, and auxiliary services that must be put in place in order to develop a viable pediatric preparedness plan.

“PEDSS guides users at a facility through collecting and preparing critical information needed for preparing a plan, in the spirit of tax preparation software like TurboTax,” she writes. But the biggest technical problem is collecting and representing knowledge. That’s where Tauberer’s census data set could come in handy.

“Currently, demographic data (number of children in four age groups) is entered manually. We want the tool to calculate this information automatically based on a zip-code. Therefore, we extend the tool to query the RDF census data server to get this information,” Ryutov writes. Currently this is the only server the software queries, but Ryutov says they plan to add calls to other census data servers to improve reliability. Those servers do not have to be RDF databases.
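A zip-code lookup of the kind Ryutov describes would boil down to sending a SPARQL query to the census data server. The sketch below only builds the query string; the `census:` prefix and predicate names are hypothetical stand-ins, not the vocabulary the actual census RDF data set uses:

```python
def census_children_query(zip_code):
    """Build a SPARQL query for child-population counts in one ZIP code.

    The prefix URI and the predicates (census:zipCode, census:ageGroup,
    census:count) are invented for illustration.
    """
    return (
        "PREFIX census: <http://example.org/census/>\n"
        "SELECT ?ageGroup ?count WHERE {\n"
        f'  ?area census:zipCode "{zip_code}" .\n'
        "  ?area census:ageGroup ?ageGroup ;\n"
        "        census:count ?count .\n"
        "}\n"
    )

query = census_children_query("90027")
```

The query string would then be POSTed to the server’s SPARQL endpoint, replacing the manual data entry the tool currently requires.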

Whether or not the information comes in RDF format, such data is pretty critical to the project’s success. According to the PDRTC web site, given the city’s dense population, industry, and high international profile, Los Angeles is at increased risk for pandemic influenza, environmental accidents, biochemical incidents, and terrorist attacks.

“Yet, as of today, health care providers are not prepared to respond when the next crisis occurs. In fact, research indicates that few hospitals throughout the 4,000 square miles of L.A. County have a tested disaster plan to adequately address the community’s needs during a widespread emergency,” the site notes. It adds that, “Although 2.8 million youngsters live in L.A. County, a recent survey suggests that less than 25 percent of the region’s hospitals and public health emergency agencies have written disaster plans that address the particular needs of children.”

Podcast: Kingsley Idehen on OpenLink Software, Linked Data

Paul Miller Contributor

In this edition of the Semantic Web Gang podcast, Paul Miller talks with Kingsley Idehen, president and CEO of OpenLink Software. They discuss OpenLink’s approach and the role that semantic technologies play in it, before turning to a broader discussion of Linked Data.

We shall be adding to the Semantic Web Gang podcasts in the coming months, as well as introducing the occasional special guest. Check back for next month’s podcast.

Listen Now

For further Talking with Talis podcasts on the emerging Web of Data, click here.

At Talis, Paul Miller is active in raising awareness of new trends and possibilities arising from wider adoption of the Semantic Web.

Ardorado’s Passion is the Semantic Web

Jennifer Zaino Contributor

Start-up Semantic Communities LLC has launched a beta of Ardorado, a portal of sorts dedicated to human passions, or “ardors,” ranging from music to Monty Python, camping to comics.

The stated mission of Semantic Communities is to bring the benefits of the Semantic Web and Web 3.0 technologies to everybody, putting semantic web tools, technologies, and standards — MediaWiki, the Virtuoso triple store, SPARQL, RDF, FOAF, and PHP — to good use by ordinary people and businesses.

According to Emil Freund, president and CEO of Semantic Communities, Ardorado is part mash-up, part social network, and part directory. Its goal is to help passionate, like-minded people meet and interact with others who share their enthusiasm for a topic, read news, and find the best sources of information or experts on those topics, while at the same time providing a platform through which businesses, professional service providers, and serious hobbyists can directly engage with the audience interested in their message. It offers a semantic profile for its members, featured or registered bloggers, site owners, and online communities; the ability to associate a blog or site with one or several related interests; and semantic representations of interests, of relationships between interests, and of relationships between members and between sites, blogs, and on-line communities and their associated interests.

We recently spoke with Freund in more detail about the portal. Following are excerpts of that conversation and a follow-up discussion in email.

Other “semantic-driven” sites are designed to help individuals follow their passions and interests. What makes Ardorado different?

Freund: Social networking is a hot topic. But to keep up with your passions you have to sign on to multiple networks. So why not create a place where we can bring interests together into one place and then create connections between interests? Every social network asks what your interests are, but they do very little with that. At the same time we [the founders] were all reading about the semantic web, and we thought it was very powerful. So why not recast social networks and how people communicate with each other, take these closed networks, and build the social network around our interests on all these standards? So we built a huge interest database — we found lots of valuable information [about interests] in sources such as Facebook, StumbleUpon, and LiveJournal, while being incredibly sensitive to people’s privacy.

We very consciously leveraged FOAF technology early in our project, during the time we collected our interest data. We collected, analyzed, and correlated millions of on-line profiles, and as a result of this process we developed the InterestMatrix semantic database.
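A member profile built on FOAF, as Freund describes, might look roughly like the following Turtle sketch. The `ex:` namespace, member identifiers, and interest terms are invented for illustration; only the FOAF vocabulary itself is real:

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/ardorado/> .

ex:member42 a foaf:Person ;
    foaf:nick     "campingfan" ;
    foaf:interest ex:camping , ex:comics ;
    foaf:knows    ex:member17 .
```

Stored in a triple store such as Virtuoso, profiles like this can be queried with SPARQL to find members who share an interest, which is one way the relationships between members and interests described above could be surfaced.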

Read more

Phase2 Wants to Push Some Semantic Buttons

Jennifer Zaino Contributor

Phase2 Technology is focused on helping publishers move into the realm of linked data through efforts such as its work with Thomson Reuters to deliver Calais modules for integration with the Drupal content management system. Last week, it announced a deal with Apture to enable publishers using Drupal to take advantage of Apture’s platform and include rich multimedia content simply by clicking on an item to embed it. That’s perhaps more of a semantic play for the future than for today.

“Apture has essentially what we see as great eye candy,” says Jeff Walpole, managing partner at Phase2. “It gives an editor the ability to create a little more dynamic inline linking. It lets an editor select things, and pull up other multimedia-oriented resources to define that term a little bit deeper.”

Apture basically lets users combine multimedia content from around the web, together in one place, associating an item with related media to build a web of information that helps users learn more about the topic the publisher is writing about.

“It’s not a true semantic play at this point,” says Walpole, but he sees possibilities in additional intelligence being added to Apture’s technology. “They are increasingly providing smart tools to semantically link in the things that you are interested in. That technology is fairly immature at this point, but it can go farther. What’s important for now is that the tool lets you link to things in an interesting way.”

And Phase2 is indeed eager to find “whiz-bang” semantic web technologies to incorporate into Drupal. What might the future hold? “I think another thing that’s really interesting and taking off in Drupal is more content exchange technologies,” says Walpole. “Part of the semantic web concept is obviously linking disparate pieces of information together. From a pure content management system perspective, that’s great. But it needs to be coupled with the fact that people just want to be able to borrow content from each other. We don’t make huge distinctions anymore about what site is serving content. Content exchange as a theme interplays very nicely with semantic web technologies.”

One of the problems to solve here will be how to keep ownership straight. “But that’s the same problem the semantic web has with itself. If you borrow from everyone else and link to everyone else, where does your content end and someone else’s begin?”

Read more