Photo credit: Eric Franzon

In the winter of 2012, The New York Times began implementing the schema.org-compatible version of rNews, a standard for embedding machine-readable publishing metadata into HTML documents. The goals were to improve the quality and appearance of its search results and to generate more traffic through algorithmically generated links. The semantic markup gave its news article pages structured data properties that define the author, the date a work was created, its editor, its headline, and so on.
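To make those properties concrete, here is a minimal sketch of how an article's metadata might be expressed with schema.org's NewsArticle type. It is an illustration only: it uses JSON-LD rather than whatever markup syntax the Times actually deployed, the helper names `news_article_jsonld` and `as_script_tag` are hypothetical, and the headline, names, and date are placeholder values.

```python
import json


def news_article_jsonld(headline, author, editor, date_created):
    """Describe an article with the schema.org properties named above."""
    return {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "editor": {"@type": "Person", "name": editor},
        "dateCreated": date_created,  # ISO 8601 date string
    }


def as_script_tag(data):
    """Wrap the metadata in a script element suitable for an HTML page."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values, not taken from any real article.
article = news_article_jsonld(
    headline="Example Headline",
    author="Jane Doe",
    editor="John Roe",
    date_created="2012-02-01",
)
tag = as_script_tag(article)
print(tag)
```

A crawler that understands schema.org can read these fields directly instead of inferring the headline or byline from page layout.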

But according to a leaked New York Times internal innovation report that appears here, there is more work to be done in the structured data realm. That work is part of a grand plan to truly put digital first as website and smartphone app readership falls and competition heats up from both old-guard and new-age newsrooms and from social media properties, which are transforming how journalism is delivered to an audience increasingly invested in mobile, social, and personalized technologies.

The report was put together with insights from parties including Evan Sandhaus, director for search, archives and semantics at The NY Times. Sandhaus was instrumental in the rNews/schema.org effort as well as the relaunch of TimesMachine, a digital archive of 46,592 issues of The New York Times whose uses include surrounding current news stories with context. The report notes that the Gray Lady has not been standing still in the face of its challenges, citing newsroom advances to grow audience with efforts such as using data to inform decisions. But it says the paper needs to do more, and faster, to make it easy to get its content in front of digital readers.

“We need to think more about resurfacing evergreen content, organizing and packaging our work in more useful ways and pushing relevant content to readers,” the report says. “And to power these efforts, we should invest more in the unglamorous but essential work of tagging and structuring data.”

It concludes that the paper of record has not “updated its structured data to meet the changing demands of our digital age and is falling far behind as a result. Without better tagging, we are hamstrung in our ability to allow readers to follow developing stories [for example, it never made a tag for Benghazi], discover nearby restaurants that we have reviewed, or even have our photos show up on search engines.”
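As a rough illustration of why event tags matter for following a developing story, the sketch below groups articles by a shared event tag. Everything in it is hypothetical: the `Article` class, the `build_event_index` helper, the tag strings, and the headlines are invented for the example and do not reflect the Times' actual tagging system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Article:
    headline: str
    event_tags: tuple = ()  # hypothetical news-event tags, e.g. "benghazi-attack"


def build_event_index(articles):
    """Map each event tag to the headlines tagged with it, in input order."""
    index = defaultdict(list)
    for article in articles:
        for tag in article.event_tags:
            index[tag].append(article.headline)
    return dict(index)


# Invented sample data for illustration only.
articles = [
    Article("Hypothetical story A", ("benghazi-attack",)),
    Article("Hypothetical story B", ()),  # untagged: cannot be followed by event
    Article("Hypothetical story C", ("benghazi-attack", "libya")),
]
event_index = build_event_index(articles)
print(event_index["benghazi-attack"])
```

The untagged article never appears in the index, which is the gap the report describes: without an event tag, there is nothing for a "follow this story" feature to key on.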

Waiting any longer to improve its use of structured data hurts everything from the paper's search engine rankings to its ability to automate the sale of its photos and serve up content relevant to mobile users’ locations, the report argues. “We must expand the structured data we create, which is still defined by the needs of the Times Index rather than our modern digital capabilities,” it says.

Sandhaus contributed his thinking on the topic, with the report quoting him as saying: “We don’t tag the one thing [news events] that people use to navigate the news.” The report describes Sandhaus as “perhaps the most passionate advocate of structured data at The Times,” whose work ensures that journalists’ work is tagged and archived for future use.

Below is a table compiled for the report of structured data that would allow the NY Times to make better use of its content:

nytimes table

The report acknowledges that it harbors no single transformational idea and that its various recommendations will face challenges. At the same time, it says there is a new sense of openness and opportunity across the organization, evidenced in overall goals that include unlocking the power of data, strategizing for growth, speed and agility, and having One NYT.