Posts Tagged ‘Thomas Tague’

Phase2 Hires Thomas Tague, Former Reuters CTO, as Chief Operating Officer


ALEXANDRIA, VA–(Marketwired – Feb 18, 2014) – Phase2, a leading digital content strategy, design, and technology firm, today announced key additions to its executive leadership team with the hiring of Thomas Tague to the newly created role of Chief Operating Officer (COO). The addition of the COO function is part of an executive realignment designed to focus on the needs of Phase2’s clients and team members as the company continues to expand its operations. Tague joins the company from Thomson Reuters, where he served as Chief Technical Officer for Media Product and Support, and was responsible for bringing together product management, engineering, and customer support for a half-billion-dollar product portfolio for the news agency.

Have Semantic Technologies Crossed the Chasm Yet?


This article kicks off a series of interviews on Semantic Technologies in the MIT Entrepreneurship Review with industry thought leaders including Thomas Tague (Thomson Reuters), Chris Messina (Google), David Recordon (Facebook), Will Hunsinger (Evri) and Jamie Taylor (Metaweb).

At first sight, the answer is yes. I recently attended the Semantic Technology Conference in San Francisco. What had begun in 2005 as a 300-person conference has grown into a 5-day event with an amazing depth both of workshops and panels and over 1,300 participants this year. The conference is organized by Semantic Universe, an online platform with the goal of “educating the world about semantic technologies and applications”.

I have had the opportunity to talk to some of the key actors and innovators that have pushed semantic technologies and linked data forward over the past years since the term “Semantic Web” was first coined by Sir Tim Berners-Lee of the World Wide Web Consortium (W3C). The term takes on different meanings in different contexts: to some it is about representation of information in certain well-defined formats to make it machine-readable and easy to interpret; to others it is about web services and the aggregation of information to create valuable applications for users, while still others would highlight the artificial intelligence aspect and its use in tackling complex problems.

I have been personally drawn to the field of semantic technologies for some time, realizing the impact that these technologies will have on the way we consume information online as well as on the possibilities from an enterprise perspective. One thing I realized at the conference was that a lot of things that we take for granted today, like online recommendations, are already powered by semantic technologies. In fact, a lot of the conversations happening in the hallways between sessions were not just about technical topics, like how best to construct OWL ontologies or how to structure SPARQL queries, but rather about business issues like designing the right monetization models, improving e-commerce with semantic technologies, and gauging the potential business impact of Facebook’s Open Graph, Twitter annotations or Google’s rich snippets. The New York Times, BBC, Newsweek, Tesco and Best Buy are some examples of companies that have been building and are relying on semantic technologies. To me, these are all strong indicators that semantic technologies have reached the tipping point.

Jamie Taylor, Minister of Information at Metaweb, the company behind Freebase, sees clear indications that semantic technologies have become more mainstream:  “Just the sheer size of the conference has increased pretty dramatically, as well as the diversity of people who actually have commercial offerings in terms of tools that matter to your typical webmaster, your typical content manager.” While there is still a strong academic track to semantic technologies, Taylor says, “it’s very interesting that sometimes semantic technologies have met the Web 2.0 lightweight user contribution-type model and as you add semantics into these types of systems – fairly lightweight semantics – all of a sudden they start getting much greater benefit.”

Managing one of the best-known semantic technology start-ups, Will Hunsinger, CEO of Evri, tells me that he has “seen a lot more activity in the last 12 months”. Naming Microsoft’s acquisition of Powerset and Apple’s acquisition of Siri as examples, he also points out that these “transactions have given validation that the technology is here and ready, but also that there is a path to liquidity.” His advice for startups and companies in the semantic technologies sector is to focus less on the technology itself and spend more time understanding consumers’ needs by asking themselves: “What does this technology do better than what’s out there such that you are going to solve a real problem?” For example, at Evri, he adds, “we create a better experience for the consumer applying the technology where it actually has a distinct advantage over keyword, e.g. delivering precise results around general topics like ‘movies’ or ‘reality tv’, understanding meaning and context (e.g. why is a particular entity popular right now) or even enabling consumers to follow topics over time”.

From a technological perspective, the recent developments around RDFa, a syntax for embedding RDF metadata directly in the attributes of a web page’s markup, will further accelerate the growth of the Semantic Web. Drupal 7, one of the biggest open source content management systems and used on hundreds of thousands of websites, comes with major RDFa functionality. The latest HTML5 draft has RDFa support in it. Facebook’s Open Graph protocol is based on RDFa. Google Rich Snippets support RDFa. According to a recent GigaOM report, Twitter Annotations are looking to use it.
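To make the RDFa idea concrete, here is a minimal sketch of how a program might read RDFa-style metadata out of a page. It relies only on the fact that Open Graph markup places a `property` and a `content` attribute on `<meta>` tags. The sample page, the class name `PropertyCollector` and the function `extract_properties` are invented for illustration, and a real RDFa processor handles far more (prefixes, nesting, other elements) than this does.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect property/content pairs found on <meta> tags."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags are inspected in this simplified sketch.
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property")
        content = attributes.get("content")
        # Tags without a "property" attribute are not RDFa-style and are skipped.
        if prop and content is not None:
            self.properties[prop] = content


# Hypothetical page fragment, invented for this example.
SAMPLE_HTML = """
<html>
  <head>
    <title>Example film page</title>
    <meta property="og:title" content="Example Film" />
    <meta property="og:type" content="movie" />
    <meta name="description" content="Plain meta tag, ignored here." />
  </head>
</html>
"""


def extract_properties(html):
    """Return a dict mapping each property name to its content value."""
    parser = PropertyCollector()
    parser.feed(html)
    return parser.properties


if __name__ == "__main__":
    for name, value in extract_properties(SAMPLE_HTML).items():
        print(name, "=", value)
```

The point of the sketch is that the metadata travels inside the same HTML a browser renders, so a crawler or application can pick up the structured statements without a separate data feed.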

The benefits of semantic technologies with respect to making online search better are most obvious and to some extent already observable today. David Recordon, Senior Open Programs Manager at Facebook, sees some powerful applications in search, essentially “giving you a filter into the world based on your friends”. Thanks to semantic technologies built into the Facebook platform “developers [can] build on top of information which people have trusted Facebook with, whether that’s status updates or things they like, people they are connected to […]”. Google’s Open Web Advocate, Chris Messina, told me he agrees that social search will play a key role in the future: “we are starting to see Google integrating Twitter streams in search experience, hopefully providing users with more actionable information, providing a number of different opinions, more contextual data. It is certainly something Google is paying a lot of attention to – information that is contextual to the user, not just generic to the world.”

But what about exploiting the power of the semantic web by pulling in data from different sources, the premise of linked data? Thomas Tague, VP Platform Strategy at Thomson Reuters and in charge of the OpenCalais project, a free service to analyze and extract concepts from user-submitted texts or web sources, told me about the exciting opportunities he sees at the intersection of highly trusted monetized content and free web content. He says that “people are not going to make $100 million bets based on blog postings. But that blog posting may be an outlier, may be an initial indicator, maybe about a layoff at a factory or something like that, that the user can now immediately link back to Thomson Reuters data and gain insight and take action.” While Tague certainly shares the enthusiasm for the growth of semantic technologies and the adoption of standards by industry participants, utilization of linked data remains low in his view. His short-term outlook for the linked data cloud therefore remains rather cautious: “There is a lot of talk about it, but with respect to our linked-data company information, people aren’t picking it up yet very much.”

So what can we expect in the near future? Jamie Taylor tells me that he thinks “the idea that you can aggregate is something very novel: all of a sudden my data is not limited to my data silo.” He distinguishes two types of data: core data, which must be managed by the organization to drive the core business, and context data, such as geo data. He believes that what “semantic technologies allow is in some sense to outsource [context data] to the community for maintenance.”

Overall, there seems to be consensus that as semantic technologies move out of the purely technical corner and beyond the innovators and early adopters in academia and government, content-heavy organizations and users like publishers or e-commerce sites will help these technologies cross the chasm as they see the largest benefit in applying the technology. As pointed out earlier, companies like The New York Times or Best Buy have already begun to build and rely on semantic technologies. As more and more companies start adopting linked data standards and share data in the linked data cloud, we will see more businesses created to derive value from aggregating data across different datasets to provide value to their users.

If this article has sparked your interest in semantic technologies, I can recommend a documentary by Kate Ray, a recent graduate from NYU with a major in Journalism/Psychology, who has contributed to the demystification of the Semantic Web through interviews with thought leaders, including Tim Berners-Lee, Clay Shirky, Chris Dixon, David Weinberger, Nova Spivack, Jason Shellen, Lee Feigenbaum, John Hebeler, Alon Halevy, David Karger and Abraham Bernstein. The clip has been viewed by more than 120,000 people so far. I asked Kate what motivated her to do the documentary: “My dad has been doing semantic web stuff for years, and my entire family never really knew what he was doing, so partly I was trying to make something that all these people here could show to their friends and family. I also had an academic interest in it.” Kate is now working on a company called Kommons, which she describes as a “Q&A forum built on top of Twitter; to let people ask questions to public figures – or anyone – and backing questions you agree with”.

MIT is at the forefront of exploring applications to commercialize linked data and semantic technologies, adding a new seminar, Linked Data Ventures, to the fall curriculum. The class will be taught by an all-star team consisting of Sir Tim Berners-Lee, Dr. Lalana Kagal, K. Krasnow Waterman, as well as Reed Sturtevant and Katie Rae. Computer science and business students will work in small teams to develop prototypes based on Semantic Web technologies.

About The Author

Rene Reinsberg is currently a member of the Entrepreneurship & Innovation program at MIT. His interests span Linked Data, Big Data, Open Data, and social graph analytics.

Thomson Reuters OpenCalais Service Adopted by the Huffington Post, DailyMe and Associated Newspapers’ Mail Online

Pioneering Publishers Tap Semantic Web Service to Speed Editorial Processes, Improve the Reader Experience and Extend Their Reach Across the New Content Economy

New OpenCalais ‘Archive Express’ Service Debuts to Help Other Publishers Get Started; Free Service Tags Large Content Archives in 24 Hours

San Jose, Calif. – The 5th Annual Semantic Technology Conference – June 16, 2009 – Following on the news of its alliance with CNET, Thomson Reuters today announced that The Huffington Post, DailyMe and UK-based Associated Newspapers Ltd.’s Mail Online have integrated the OpenCalais service into their popular news sites and services.

These pioneering publishers join Thomson Reuters and CNET in ushering in a new wave of innovation in digital media and online publishing. They are using OpenCalais to achieve new efficiencies in content operations and editorial processes, speeding the delivery of breaking news to readers. They are also using OpenCalais to reach new milestones in localization, personalization and search engine optimization (SEO).

“OpenCalais enables our editors to more efficiently locate related local stories,” said Paul Berry, CTO, The Huffington Post. “This helps the site meet an important strategic goal: cost-effectively producing regional microsites that ‘super-serve’ communities with the best local news, as we have done in Chicago.”

OpenCalais helps publishers compete. The free service makes it easy to automate content operations, enhance the value of content, improve the reader experience and extend distribution to new search engines, news aggregators and social media applications.

"OpenCalais helps us to create a picture of a user’s behavior and interests, so that we can personalize the news for them," said Neil Budde, President and Chief Product Officer, DailyMe. “That capability has enabled us to add automated personalization features that both improve our readers’ experience and help optimize ad targeting for our partners.”

Today also marked the debut of Thomson Reuters’ new OpenCalais ‘Archive Express’ service, which can tag an archive of up to 20 million documents in 24 hours.

“Tagging archived content is a simple way to get started with OpenCalais, and the fastest way to give old stories new life,” said Thomas Tague, OpenCalais Initiative lead, Thomson Reuters. “It can help publishers repurpose – and even drive incremental revenue from – historical content, and makes it easy to bring archived stories into ‘related stories’ applications, ‘recommended reading’ widgets and more.”

OpenCalais uses natural language processing (NLP) to “read” an article, extracting the ‘who, what, when, where and how’ from the story. Breaking content down into its basic elements makes it easier to manipulate – automating the creation of topic hubs and microsites – and improves its search relevance.

"OpenCalais was originally part of a suite of data mining and SEO solutions we assembled for Mail Online, and our intention was to use it to ‘Sanity Check’ the rest,” said Simon Schnieders, SEO Manager for Associated Newspapers’ Mail Online, “It speaks volumes for the service that we came to rely on OpenCalais for entity extraction."

The Thomson Reuters OpenCalais initiative is committed to helping publishers improve their online business results. With OpenCalais, publishers can:

  • Automate: Automatically tag the entities, facts and events in content to increase its value.
  • Enhance: Enrich content with open data from Wikipedia, Geonames and more.
  • Engage: Optimize the user experience, increase engagement and drive repeat visits.
  • Extend: Increase reach to new search engines, aggregators, ‘related stories’ apps and more.
  • Connect: Compete in tomorrow’s media ecosystem of enriched and interconnected content.

Note to attendees of the 5th Annual Semantic Technology Conference: Thomas Tague is a keynote speaker this morning at SemTech; he takes the stage at 8:30 a.m. PT.

Learn more about how CNET and The Huffington Post are using OpenCalais in the SemTech 2009 Publisher Panel with Jim Stanley, Vice President – Products, CBS Interactive – Technology & News; Paul Berry, CTO, The Huffington Post and more. It takes place today at 2 p.m. PT.

The OpenCalais Archive Express service is available today. With shipping, users can expect their archive to be received, tagged and returned to them within one week. Please contact Professional at to get started.


Thomson Reuters Adds ‘Social Tags’ and Spanish Language Support to its OpenCalais Service

Social Tags Use Simple, Everyday Terms to Categorize Stories;
Make it Easy for Editors to Filter News by Human Interest

San Jose, Calif. – The 5th Annual Semantic Technology Conference – June 15, 2009 – Thomson Reuters today announced significant upgrades to its OpenCalais service. The update adds new ‘Social Tags’ – story descriptors in simple, everyday language – and support for Spanish language content to OpenCalais’ core capabilities. It also adds a new ‘Recession Pack’ of facts and events that OpenCalais can extract from news about company actions related to a down economy.

OpenCalais helps publishers compete. The free service makes it easy to automate content operations, enhance the value of content, improve the reader experience and extend distribution to new search engines, news aggregators and social media applications.

“With these updates, we are increasing the relevance, impact and appeal of the OpenCalais Service worldwide, for everything from content operations to SEO,” said Thomas Tague, Calais Initiative lead, Thomson Reuters. “Social Tags and Spanish language support are two of the most in-demand features with our partner publishers and community of Web developers alike. We are very pleased to be able to meet that demand and quickly bring them to light.”

The new features in the OpenCalais service include:

Social Tags: Social Tags go beyond news categories – such as “lifestyle,” “sports” or “entertainment” – to describe what a story, blog post or document is about using common, conversational terms, such as “gourmet cooking,” “auto racing” or “new movie release.” Social Tags make it easier for editors to filter news by human interest, and to create compelling collections of related content such as microsites that improve search engine optimization (SEO) and bolster reader engagement.

Social Tags are based on sophisticated analysis of an entire document that has been mapped to the OpenCalais knowledgebase as well as Wikipedia. In addition to helping streamline content operations, Social Tags can be used as keywords for ad placement and as metatags for SEO.

Entity Extraction in Spanish: As with French, OpenCalais’ initial support for Spanish language content extends to entity extraction in the top categories, including people, cities, countries, company names and more. Extraction of facts and events will follow in 2010.

The Recession Pack of Facts & Events: Given today’s environment, OpenCalais has been tuned to extract a new set of facts and events related to company performance and company actions in a down economy, including accounting changes, labor issues, layoffs, earnings restatements, delayed filings and more.

The Complete Set of IPTC Newscodes: In addition to the new Social Tags, the OpenCalais service now categorizes news stories and documents using one of 17 top-level subject codes from the International Press Telecommunications Council (IPTC) NewsCodes taxonomy.

Enhanced Linked Data URIs (Uniform Resource Identifiers): The top-level company name URIs that OpenCalais returns along with document metadata have been enhanced to reflect ongoing updates of company information as those changes happen in the Thomson Reuters Linked Data repository.

In addition, the company URIs OpenCalais returns will now feature links to related entries in TechCrunch’s CrunchBase, and include specific URIs for company officers and company competitors, making it easier to source and navigate these key elements of competitive business information.

Linked Data URIs in JSON: In addition to HTML and RDF, OpenCalais’ Linked Data URIs for companies, geographies and more are now available in the JSON format. Users can retrieve URIs as JSON by appending .json to the URI or calling the OpenCalais service with an appropriate caller type.
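As a rough illustration of the “append .json” convention described above, the sketch below derives the JSON address from a Linked Data URI. The release does not show an actual URI, so the `example.org` address is a placeholder, and the function name `json_variant` is invented for this example; it only builds the address and does not call the service.

```python
from urllib.parse import urlsplit, urlunsplit


def json_variant(uri):
    """Return the URI with ".json" appended to its path.

    Any query string or fragment is preserved, and a URI that already
    ends in ".json" is returned unchanged.
    """
    parts = urlsplit(uri)
    path = parts.path
    if not path.endswith(".json"):
        # Drop a trailing slash so the suffix attaches to the last segment.
        path = path.rstrip("/") + ".json"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    # Placeholder URI, not a real OpenCalais identifier.
    print(json_variant("http://example.org/id/company/acme"))
```

Working on the parsed path rather than on the raw string keeps the suffix in the right place when a URI carries a query string.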

Opt-In Storage of Document-Level Metadata URIs: Whereas OpenCalais used to return all document-level URIs to the user, the service will now store them on the user’s behalf, returning them on an opt-in basis as needed. This is helpful for publishers processing large quantities of content and preserves their prerogative to share their document-level identifiers when they see fit.

The updates to the OpenCalais service are being rolled out as a phased auto-upgrade to the existing service. No changes are required on the part of partners or developers.

  1. Phase one (OpenCalais version 4.1) – which goes live today – includes Social Tags, the ‘Recession Pack’ of company facts and events, and more.
  2. Phase two (OpenCalais version 4.2) – available in mid-July – includes support for Spanish language content, enhanced Linked Data URIs and more.