News publishing outlets stand to benefit from adopting Semantic Web technologies, and now there’s a lightweight way for them to begin moving in that direction, too.
The International Press Telecommunications Council (IPTC) recently introduced rNews 0.1, a set of specifications and best practices for using RDFa to embed news-specific metadata (headlines, bylines, publication dates and so on) into HTML documents. It hopes rNews will become an industry standard for carrying the deep structure and explicitly modeled content that exists in publishers’ back-end data layers through to HTML documents and on to browsers. The wider its adoption across news channels, the greater the chance of innovative apps cropping up that can help publishers increase engagement with their audiences, according to rNews’ developers.
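To make the idea of embedded metadata concrete, here is a minimal sketch of how a consumer might read RDFa `property` values out of a marked-up article page. The sample markup, the `rnews:` property names and the namespace URI are illustrative assumptions, not quotations from the rNews 0.1 specification, and the parser handles only a small simplification of RDFa (flat elements, with an optional `content` attribute overriding the visible text).

```python
from html.parser import HTMLParser

# Illustrative markup only: the "rnews:" property names and the namespace URI
# are assumptions for this sketch, not taken from the rNews 0.1 specification.
SAMPLE_HTML = """
<div xmlns:rnews="http://example.org/rnews#" typeof="rnews:Article">
  <h1 property="rnews:headline">Council Approves New Budget</h1>
  <span property="rnews:byline">By Jane Doe</span>
  <span property="rnews:datePublished" content="2011-04-07">April 7, 2011</span>
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collect RDFa `property` values from a page (a simplification of RDFa)."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._current = None  # property whose text content is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # An explicit content attribute overrides the visible text.
            self.metadata[prop] = attrs["content"]
            self._current = None
        else:
            self._current = prop
            self.metadata[prop] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.metadata[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


def extract_metadata(html):
    """Return a dict mapping property names to their stripped values."""
    parser = PropertyExtractor()
    parser.feed(html)
    return {key: value.strip() for key, value in parser.metadata.items()}


metadata = extract_metadata(SAMPLE_HTML)
```

A production consumer would more likely rely on a full RDFa processor than on hand-rolled parsing. The point here is only that the headline, byline and date become machine-readable without any site-specific scraping logic.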
As an example, proposes Stuart Myles, deputy director of schema standards at the Associated Press who’s heading up the IPTC Semantic Web work, consider how marking up news content in rNews can open it up for interesting re-purposings and re-contextualizations. A number of publishers are experimenting with APIs that give others access to data from just their one publication, he says. But if multiple publishers adopt rNews and support its standardization, there’s the potential for developers to more easily create applications that mix and match information from across different venues. “Someone could build an application that says to alert them when other photos with certain words in the caption appear and are published after this date and time,” he says, “or an application that shows photos only from the New York Times, the AP and Getty Images.” Such apps could help drive traffic back to publishers’ web sites and enable greater interactions with the news they deliver.
Right now, building such “apps that aggregate” is a job that essentially requires developers to write software to understand each web site from which they’d like to pull data – and that often means small news sites get left out in the cold as not being worth the effort. As more sites implement rNews, it can lower the bar for “what developers have to invest in building those aggregation experiences,” says Evan Sandhaus, the lead architect for the New York Times’ semantic platform and that publisher’s delegate to the IPTC.
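The kind of cross-publisher alert Myles describes can be sketched in a few lines once metadata from several sites has been collected into a common shape. The `Photo` record, its field names and the sample data below are hypothetical and exist only for illustration. The sketch assumes the metadata has already been extracted from each publisher's pages, and it uses simple substring matching on captions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Photo:
    """Hypothetical record holding metadata already extracted from a page."""
    publisher: str
    caption: str
    published: datetime


def matching_photos(photos, words, after, publishers=None):
    """Return photos whose caption contains every word and that were
    published after `after`, optionally limited to a set of publishers."""
    wanted = [word.lower() for word in words]
    results = []
    for photo in photos:
        if publishers is not None and photo.publisher not in publishers:
            continue
        if photo.published <= after:
            continue
        caption = photo.caption.lower()
        if all(word in caption for word in wanted):
            results.append(photo)
    return results


# Made-up sample data standing in for metadata gathered from several sites.
photos = [
    Photo("New York Times", "Flood waters rise in the city", datetime(2011, 4, 8, 9, 0)),
    Photo("AP", "Flood relief volunteers arrive", datetime(2011, 4, 6, 12, 0)),
    Photo("Getty Images", "Flood damage along the river", datetime(2011, 4, 9, 15, 30)),
    Photo("Local Gazette", "Flood closes main street", datetime(2011, 4, 9, 8, 0)),
]

alerts = matching_photos(
    photos,
    ["flood"],
    after=datetime(2011, 4, 7),
    publishers={"New York Times", "AP", "Getty Images"},
)
```

Because every record has the same fields regardless of where it came from, adding a small publisher to the mix means adding a name to the `publishers` set rather than writing a new scraper.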
What Else Gets Better?
Other potential benefits that could argue for rNews’ adoption by publishers range from the obvious one of better search engine experiences to some that are a bit more speculative. The right metadata – and correctly labeled metadata – could help in winnowing the scope of ads to those that are appropriate for a particular article. “One of the holy grails of the publishing industry was finding ways to show more targeted ads, such as by gathering information about me, the viewer, to identify me and serve ads that target my interests. And many of us on the web are somewhat opposed to that. But what is a good interim solution is to tie ads to the context of the web pages people view,” says Andreas Gebhard, manager, Getty Images Global Picture Desk and that company’s delegate to the IPTC. “If you could expose automatically what this particular text on the page is about, and a major ad network would scrape that information and serve ads that are pertinent to that or even avoid embarrassing things that shouldn’t be served, that might be a very quick win. It might prove to be very beneficial to advertisers, publishers and even readers.” Not that rNews is seeking to supplant some techniques and technologies, including semantic ones, that already are in the ad space, says Myles, but it could be used to support such approaches.
Another potential (and still-in-need-of-greater-exploration) use: Improving the experiences for the sight-impaired. “Screen readers have the same problems search engines do to decompose the semantics of the article on the page,” says Sandhaus.
Of course, today a lot of news publishers that are looking for new ways to grow reach and revenue in these extremely challenging times are focusing on things like building a new iPhone app. So how are they to find the time and resources to put behind rNews? The good news, say its proponents, is that they don’t need much of either. “We want a CIO to look at this as taking three days of developing and testing, not as a month-long infrastructure investment,” Sandhaus says. What’s needed, according to Gebhard, is mostly making some adjustments to their main publishing templates, not a massive commitment of funds and technology. “It was on our minds throughout the design process that we wanted to make this more of a ‘shrug-offable’ investment,” he says. “As rNews is currently structured, with some basic understanding and primers we are offering, I think it’s reasonable that an up-to-date web developer or designer could easily get the hang of it and actually implement it.”
Myles adds that those outfits that have been building tools on the hNews microformat will find it relatively easy to support rNews, too. So, speaking of hNews, what of the more than 1,000 news sites that are already using it? No reason you can’t mark up content with both, Myles says.
“They’re different technical approaches so they are not in conflict. Adopting rNews doesn’t stop you from doing other things within your web page as well,” he says. “Part of it is that there is an existing set of sophisticated tools for extracting information from web pages using RDFa, so [rNews] is an important set of things to support or add. Also some things that we’re adding to rNews are not directly expressive in hNews, like comments.”
Adds Sandhaus, “We see this standard as just the latest step in a long story in the structure of news content becoming increasingly accessible to the channels we want to get it to.”
If you’re in the New York area, you might want to check out next week’s Meetup at the New York Times offices to learn more about rNews.