Evan Sandhaus reports for the New York Times that rNews has finally arrived. He explains, “On January 23rd, 2012, The Times made a subtle change to articles published on nytimes.com. We rolled out phase one of our implementation of rNews – a new standard for embedding machine-readable publishing metadata into HTML documents. Many of our users will never see the change but the change will likely impact how they experience the news. Far beneath the surface of nytimes.com lurk the databases — databases of articles, metadata and images, databases that took tremendous effort to develop, databases that the world only glimpses through the dark lens of HTML.” Read more
Posts Tagged ‘rNews’
Online publishers and other content providers have a new analytics tool to help them understand what their readers care about and use that information to connect those readers with relevant, compelling content on their sites. Launching today is Dash, based on the predictive content analytics platform Parse.ly. The technology crawls every article page for Parse.ly’s publisher-partners and analyzes the text, in real time and at scale, to identify relevant topics and group related content together. Behind this lies natural language processing technology, which uses language cues hidden inside the text to determine its affiliated topics. To date, Dash has extracted over 350,000 unique topics from all the URLs it has crawled during private beta, yielding a healthy taxonomy of the topics users are consuming across the web.
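To make the idea of grouping crawled pages by topic concrete, here is a minimal sketch in Python. It is not Parse.ly's method: plain keyword matching stands in for the natural language processing described above, and the topic vocabulary, function names, and example URLs are all hypothetical.

```python
from collections import defaultdict

# Hypothetical topic vocabulary; a real system would derive topics from the
# text itself rather than from a hand-written keyword list.
TOPIC_KEYWORDS = {
    "semantic web": ["rdfa", "schema.org", "linked data"],
    "publishing": ["publisher", "newsroom", "paywall"],
}


def extract_topics(text):
    """Return the set of topics whose keywords appear in the text."""
    lowered = text.lower()
    return {
        topic
        for topic, keywords in TOPIC_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    }


def group_by_topic(pages):
    """Map each topic to the list of URLs whose text mentions it.

    `pages` is a dict of URL -> article text (assumed already crawled).
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        for topic in sorted(extract_topics(text)):
            groups[topic].append(url)
    return dict(groups)


if __name__ == "__main__":
    sample_pages = {
        "http://example.com/a": "Publishers are adopting RDFa markup.",
        "http://example.com/b": "The newsroom debates a paywall.",
    }
    for topic, urls in group_by_topic(sample_pages).items():
        print(topic, urls)
```

Keeping the result as a topic-to-URLs mapping means related content can be looked up by topic in a single step, which is the grouping behavior the post describes.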
[UPDATE - November 9, 2011: the IPTC rNews version 1.0 documentation is now available.]
Today (Oct. 7, 2011), at a gathering of the International Press Telecommunications Council (IPTC), rNews took the step from being a proposal to being a formal standard. rNews was created by the IPTC and made its public debut earlier this year as a proposal for using RDFa to annotate news-specific metadata in HTML documents.
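As a rough illustration of what annotating news metadata with RDFa can look like, the following Python sketch renders an HTML fragment carrying rNews-style attributes. The namespace URI, the class name rnews:Article, and the property names (headline, datePublished, creator) are assumptions made for illustration and should be checked against the IPTC rNews documentation; the function name is hypothetical, and the `prefix` attribute follows RDFa 1.1 syntax.

```python
from html import escape

# Assumption: illustrative namespace URI; verify against the IPTC rNews docs.
RNEWS_NS = "http://iptc.org/std/rNews/2011-10-07#"


def render_rnews_fragment(headline, date_published, author):
    """Return an HTML fragment whose RDFa attributes describe a news article.

    The rnews: class and property names used here are assumed, not
    confirmed by the post above.
    """
    # Escape every value so that text content and attribute values are safe.
    headline = escape(headline)
    date_published = escape(date_published)
    author = escape(author)
    return (
        f'<div prefix="rnews: {RNEWS_NS}" typeof="rnews:Article">\n'
        f'  <h1 property="rnews:headline">{headline}</h1>\n'
        f'  <span property="rnews:datePublished" '
        f'content="{date_published}">{date_published}</span>\n'
        f'  <span property="rnews:creator">{author}</span>\n'
        f"</div>"
    )


if __name__ == "__main__":
    print(render_rnews_fragment("rNews is a standard", "2011-10-07", "A. Reporter"))
```

The visible text stays the same for human readers; only the added attributes expose the headline, date, and author to machines.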
Congratulations to the IPTC and the leaders of the rNews standardization effort: Andreas Gebhard (Getty Images), Evan Sandhaus (New York Times), and Stuart Myles (Associated Press).
A room full of interested parties gathered at Microsoft’s Silicon Valley Campus yesterday to discuss Schema.org, its implications for existing vocabularies, syntaxes, and projects, and how best to move forward on what has admittedly been a bumpy road.
Schema.org, you may recall, is the vocabulary for structured data markup that was released by Google, Microsoft, and Yahoo! on June 2 of this year. The schema.org website states, “A shared markup vocabulary makes it easier for webmasters to decide on a markup schema and get the maximum benefit for their efforts. So, in the spirit of sitemaps.org, Bing, Google and Yahoo! have come together to provide a shared collection of schemas that webmasters can use.” (For more history about the roll-out and initial reactions to it, here’s a summary.)
Yesterday was the first time since the Semantic Technology & Business Conference in San Francisco that community members have gathered face-to-face to discuss Schema.org in an open forum. It was a full agenda with plenty of opportunity for debate and discussion.
The media industry has had a complicated relationship with the Web, and that’s putting it kindly. While other sectors pretty quickly realized ways to take advantage of that new thing called the Internet – to sell goods, accelerate supply chains, and build deeper customer relationships – established content providers spent years trying to figure it out. And many still are tussling with big issues, such as whether or not to charge for access to content.
Given the Web’s impact on their business model and their revenues, you can forgive publishers for wishing the darn Internet would just stand still for a few minutes and let them catch their breath and catch up. Since that isn’t about to happen, the thing to do is to make peace with those changes, many of them thanks to Semantic Web technologies, and figure out fast how they’re going to profit from them.
They’ll have an opportunity to do just that at the upcoming Semantic Web Media Summit in New York City, whose speakers will include Michael Dunn, VP and CTO at Hearst Interactive Media, on why media companies should be interested in this critical part of the Web 3.0 world.
Dunn sees a number of reasons for using Semantic Web technologies as the means for structuring the wealth of content that publishers produce. There’s improving its discoverability by the world via search and social, of course, but it matters for internal operations, too. Add to that the tie to online advertising, which lets content be better monetized.
I hate to even mention how quickly Summer is passing, but as we head into August, it’s time to start making plans for the busy Fall event season. September is particularly full of Semantic Tech events.
September 14, in New York City, the Semantic Web Media Summit will take place. A half-day meeting focused on uses of Semantic Web in media, advertising, and publishing, the event is produced by SemanticWeb.com, Lotico.com and our parent company, MediaBistro. With a keynote by Mike Dunn, CTO of Hearst Interactive, and contributions from a stellar group of presenters, the program promises to be a must-attend event for anyone in the New York area interested in how Semantic Technology is changing the media world. OpenAmplify is sponsoring the conference.
September 21-23, DC-2011, the eleventh International Conference on Dublin Core and Metadata Applications, will take place at the National Library of the Netherlands in The Hague.
September 26-27, the London Semantic Technology and Business Conference (#SemTechBiz) takes place at the Hotel Russell. This two-day executive conference is designed for business and technology executives who need to learn what semantic technologies are and how to take advantage of semantics in their enterprise and web-based systems. Attendees will further their technical understanding in introductory sessions and learn from the keynote speakers John O’Donovan (Press Association), Martin Hepp (Hepp Research), Steve Harris (Garlik), and Dennis E. Wisnosky (U.S. Department of Defense).
A recent article reports, “Draft version 0.5 of rNews, a standard model for embedding machine-readable metadata in online news, was approved at the International Press Telecommunications Council’s [IPTC] Annual General Meeting in Berlin, Germany. This version clarifies, simplifies and expands the rNews model and incorporates much of the feedback that the IPTC has received about the first draft. The IPTC is also releasing the recommended RDFa implementation for rNews 0.5 and plans to provide mappings to other markup mechanisms, such as HTML5 microdata and JSON. This will give publishers a choice of how to implement the model, using the technologies that best meet their needs.” Read more
If it’s been a while since you have looked in on the conference program for SemTech SF, you may have missed the addition of some significant, exciting sessions. Recent additions to this year’s conference include:
Aditya Kalyanpur, research staff member for IBM Research, will lead a session entitled “Building Watson: An Overview of the DeepQA Project for the Jeopardy! Challenge,” in which he will discuss the DeepQA technology and describe what it was like to build Watson, the computer system that won on Jeopardy!.
“The rise of the Interest Graph: How Semantic Technology Will Lead What’s Next for the Social Web.” Dave S. Copps, CEO of PureDiscovery Corporation, will discuss the reasons why transactional searches will be replaced by social filtering. The real power and potential of semantic technologies will be unleashed as semantic vendors integrate the richness of the data being generated by the social graph (Twitter, Facebook, etc.) to create networks that share more than just a relationship.