Posts Tagged ‘Evan Sandhaus’

A Look Inside The New York Times’ TimesMachine

Adrienne LaFrance of The Atlantic reports, “One of the tasks the human brain best performs is identifying patterns. We’re so hardwired this way, researchers have found, that we sometimes invent repetitions and groupings that aren’t there as a way to feel in control. Pattern recognition is, of course, a skill computers have, too. And machines can group data at scales and with speeds unlike anything a human brain might attempt. It’s what makes computers so powerful and so useful. And seeing the structural framework for patterns across vast systems of categorization can be enormously revealing, too.” Read more

Structured Data In The Spotlight At The New York Times

Photo credit: Eric Franzon


In the winter of 2012, The New York Times began implementing the schema.org-compatible version of rNews, a standard for embedding machine-readable publishing metadata into HTML documents. The goals were to improve the quality and appearance of its search results and to generate more traffic through algorithmically generated links. The semantic markup for news articles brought structured data properties to its web pages, defining the author, the date a work was created, its editor, its headline, and so on.
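To make the idea concrete, here is a minimal sketch of what such markup can look like, written as a small Python helper that renders an HTML fragment. It assumes the schema.org microdata syntax and uses the schema.org property names headline, author, editor and dateCreated. The function name and all sample values are hypothetical, and this is not the markup The Times actually shipped.

```python
from html import escape


def render_news_article(headline, author, editor, date_created):
    """Return an HTML fragment carrying schema.org NewsArticle microdata.

    Illustrative only: the element layout is an assumption, not taken
    from any real nytimes.com page.
    """
    return "\n".join([
        '<article itemscope itemtype="http://schema.org/NewsArticle">',
        f'  <h1 itemprop="headline">{escape(headline)}</h1>',
        f'  <span itemprop="author">{escape(author)}</span>',
        f'  <span itemprop="editor">{escape(editor)}</span>',
        f'  <meta itemprop="dateCreated" content="{escape(date_created)}">',
        '</article>',
    ])


# Hypothetical sample values.
snippet = render_news_article(
    "Example Headline", "Jane Doe", "John Roe", "2012-01-23"
)
print(snippet)
```

A human reader sees only the headline and the names, while a search engine that parses the page can read each itemprop value as a labeled field.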

But according to a leaked New York Times internal innovation report that appears here, there’s more work to be done in the structured data realm. That work is part of a grand plan to truly put digital first in the face of falling website and smartphone app readership and hotter competition from both old-guard and new-age newsrooms and social media properties, which are transforming how journalism is delivered to an audience increasingly invested in mobile, social, and personalized technologies.

The report was put together with insights from parties including Evan Sandhaus, director for search, archives and semantics at The NY Times, who was instrumental in the rNews/schema.org effort as well as the relaunch of TimesMachine, a digital archive of 46,592 issues of The New York Times whose uses include surrounding current news stories with context. The report notes that the Gray Lady has not been standing still in the face of its challenges, citing newsroom advances to grow audience with efforts such as using data to inform decisions. But it also says the paper needs to do more, and faster, to make it easy to get its content in front of digital readers.

Read more

Summary of 11th International Semantic Web Conference

Big Graph Data Panel at ISWC 2012

Big Graph Data Panelists (L to R): Mike Stonebraker, John Giannandrea, Bryan Thompson, Tim Berners-Lee, Frank van Harmelen

Last week, the 11th International Semantic Web Conference (ISWC 2012) took place in Boston. It was an exciting week to learn about the advances of the Semantic Web and current applications.

The first two days, Sunday November 11 and Monday November 12, consisted of 18 workshops and 8 tutorials. The following three days (Tuesday November 13 – Thursday November 15) consisted of keynotes, presentation of academic and in-use papers, the Big Graph Data Panel and industry presentations. It is basically impossible to attend all the interesting presentations. Therefore, I am going to try my best to summarize and offer links to everything that I can.

Read more

Catching Up With rNews At NYC SemTech

What’s the latest news about rNews? Attendees at the SemTech event in NYC Tuesday had a chance to find out.

“The future of rNews 1.0 is rNews 1.1,” said Stuart Myles, deputy director of schema standards at the Associated Press, who also heads up the International Press Telecommunications Council’s Semantic Web work. At next week’s IPTC meeting a vote will be taken on V. 1.1, with its adoption the hoped-for outcome.

Read more

Dynamic Semantic Publishing for Beginners, Part 2

Even as semantic web concepts and tools are underpinning revolutionary changes in the way we discover and consume information, people with even a casual interest in the semantic web have difficulty understanding how and why this is happening. One of the most exciting application areas for semantic technologies is online publishing, although for thousands of small-to-medium sized publishers, unfamiliar semantic concepts are intimidating enough to obscure the relevance of these technologies. This three-part series is part of my own journey to better understand how semantic technologies are changing the landscape for publishers of news and information. Read Part 1.

—-

News and media organizations were well represented at the Semantic Technology and Business Conference in San Francisco this year. Among the organizations presenting were the New York Times, the Associated Press (AP), the British Broadcasting Corporation (BBC), Hearst Media Co., Agence France-Presse (AFP), and Getty Images.

It was interesting to note that, outside of the New York Times, which has been publishing a very detailed index since 1912, many news organizations presenting at the conference did not make the extensive classification of content a priority until the last decade or so. It makes sense that, in a newspaper publishing environment, creating a detailed and involved index that guides every reader directly to a specific subject mentioned in the paper must not have seemed as critical as it does now. After all, the reader was not likely to keep the newspaper as future reference material, so the work of indexing news content by subject was left for the most part to librarians, well after an article was published.

In the early days of the internet, categorization of content (where it existed) was limited to simple taxonomies or to free tagging. News organizations made rudimentary attempts to identify subjects covered by content, but did not provide much information about relationships between these subjects. Search functions matched the words in the search to the words in the content of the article or feature. Most websites still organize their content this way.

The drawback of this approach to online publishing is that it doesn’t make the most of the content “assets” publishers possess. Digital content has the potential to be either permanent or ephemeral: it can exist and be accessed by a viewer for as long as the publisher chooses to keep it, and many news organizations are beginning to realize the value of giving their material a longer shelf life by presenting it in different contexts. If you have just read an article about, say, Hillary Clinton, you might be interested in a related story about the State Department, or perhaps her daughter Chelsea, or her husband Bill. But how would any content management system be able to serve up a related story if no one had bothered to indicate somewhere what the story is about and how these people and/or concepts are related to one another?
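As a rough illustration of the point, the sketch below shows how a system could suggest related stories once two things are recorded: which subjects each story is about, and how those subjects relate to one another. Everything in it (the RELATED and STORIES data, the story titles, and the related_stories function) is hypothetical and far simpler than a real publisher's vocabulary.

```python
# Hypothetical relationships between subjects (subject -> related subjects).
RELATED = {
    "Hillary Clinton": {"State Department", "Chelsea Clinton", "Bill Clinton"},
}

# Hypothetical stories, each tagged with the subjects it is about.
STORIES = [
    {"title": "Clinton visits Asia", "subjects": {"Hillary Clinton"}},
    {"title": "State Department budget", "subjects": {"State Department"}},
    {"title": "Bill Clinton speech", "subjects": {"Bill Clinton"}},
    {"title": "Local weather", "subjects": {"Weather"}},
]


def related_stories(story, stories, related):
    """Return titles of other stories about the same or linked subjects."""
    linked = set(story["subjects"])
    for subject in story["subjects"]:
        # Follow one hop of declared relationships from each subject.
        linked |= related.get(subject, set())
    return [
        other["title"]
        for other in stories
        if other is not story and other["subjects"] & linked
    ]


print(related_stories(STORIES[0], STORIES, RELATED))
```

Plain keyword matching would miss these suggestions whenever the second story never mentions Hillary Clinton by name. The link comes from the declared relationship between subjects, not from the words in the text.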

Read more

SemTech’s Schema.org Panelists Talk Openness, Adoption, Interoperability

Panelists: Ivan Herman (Moderator), Dan Brickley, R.V. Guha, Peter Mika, Steve Macbeth, Jeffrey Preston, Alexander Shubin, Evan Sandhaus

A packed room at the Semantic Tech & Business Conference in San Francisco played host to the much-anticipated Schema.org panel on Wednesday morning. As W3C semantic activity lead and moderator Ivan Herman had hoped (see this article), the discussion didn’t get bogged down in a duel between RDFa and microdata, but rather emphasized some important accomplishments of the last year and looked forward to future work.

As Herman put it, the only discussion he wanted to have around RDFa was to announce that the proposed RDFa 1.1 recommendations are expected to be published as official W3C standards Thursday, and that there had been a lot of interaction with the schema.org folks to make this useable for them as well.

Wednesday’s panel was composed of: Dan Brickley, of Schema.org at Google; R.V. Guha of Google; Steve Macbeth of Microsoft; Peter Mika of Yahoo!; Jeffrey W. Preston of Disney Interactive Media Group; Evan Sandhaus of The New York Times Company; and Alexander Shubin of Yandex.

Here are highlights of what took place:

Read more

Expert Schema.org Panel Finalized for #SemTechBiz San Francisco Program

Q: What do Google, Microsoft, Yahoo!, Yandex, the New York Times, and The Walt Disney Company have in common?

A: schema.org

On June 2, 2011, schema.org was launched with little fanfare, but it quickly received a lot of attention. Now, almost exactly one year later, we have assembled a panel of experts from the organizations listed above to discuss what has happened since and what we have to look forward to as the vocabulary continues to grow and evolve, including up-to-the-minute news and announcements. The panel will take place at the upcoming Semantic Technology and Business Conference in San Francisco.

Moderated by Ivan Herman, the Semantic Web Activity Lead for the World Wide Web Consortium, the panel includes representatives from each of the core search engines involved in schema.org, and two of the largest early implementers: The New York Times and Disney. Among the topics we will discuss will be the value proposition of using schema.org markup, publishing techniques and syntaxes, vocabularies that have been mapped to schema.org, current tools and applications, existing implementations, and a look forward at what is planned and what is needed to encourage adoption and consumption.

Panelists:

Moderator: Ivan Herman, Semantic Web Activity Lead, World Wide Web Consortium
Dan Brickley, Contractor, schema.org at Google
John Giannandrea, Director Engineering, Google
Peter Mika, Senior Researcher, Yahoo!
Alexander Shubin, Product Manager, Head of Strategic Direction, Yandex
Mike Van Snellenberg, Principal Program Manager, Microsoft/Bing
Evan Sandhaus, Semantic Technologist, New York Times Company
Jeffrey W. Preston, SEO Manager, Disney Interactive Media Group

These panelists, along with the rest of the more than 120 speakers from SemTechBiz, will be on hand to answer audience questions and discuss the latest work in Semantic Technologies. You can join the discussion by registering for SemTechBiz – San Francisco today (and save $200 off the onsite price).

All the rNews That’s Fit to Print

Evan Sandhaus reports for the New York Times that rNews has finally arrived. He explains, “On January 23rd, 2012, The Times made a subtle change to articles published on nytimes.com. We rolled out phase one of our implementation of rNews – a new standard for embedding machine-readable publishing metadata into HTML documents. Many of our users will never see the change but the change will likely impact how they experience the news. Far beneath the surface of nytimes.com lurk the databases — databases of articles, metadata and images, databases that took tremendous effort to develop, databases that the world only glimpses through the dark lens of HTML.” Read more

rNews 1.0 is an Official Standard!

[UPDATE - November 9, 2011: the IPTC rNews version 1.0 documentation is now available.]

rNews presentation at Schema.org event

Evan Sandhaus, New York Times (seated) and Andreas Gebhard, Getty Images, present rNews.

Today (Oct. 7, 2011), at a gathering of the International Press Telecommunications Council (IPTC), rNews took the step from being a proposal to being a formal standard. rNews was created by the IPTC and made its public debut earlier this year as a proposal for using RDFa to annotate news-specific metadata in HTML documents.

Congratulations to the IPTC and the leaders of the rNews standardization effort: Andreas Gebhard (Getty Images), Evan Sandhaus (New York Times), and Stuart Myles (Associated Press).

Read more

Schema.org: First, The Good News

When it comes to schema.org, there’s some good news – and some ‘eh’ news.

Let’s start with the positive stuff. Today the schema blog announced that schema.org has added new mark-up properties, based on the rNews standard from the International Press Telecommunications Council (IPTC), to its NewsArticle type and to related types such as CreativeWork.

Read more
