One of the issues organizations confront when they take to semantically processing data is how to handle all the results of that work. The output of extracting entities, tagging concepts, classifying page topics and parsing sentiment makes its way to a data store that can get pretty big, making for intense storage and analytics demands.
Orchestr8’s NLP- and machine learning-based AlchemyAPI service, which just last week added sentiment analysis to its repertoire, gives content providers, social media monitoring companies and contextual advertisers the tools for all of the above, and so feeds those big data stores. Now it has in beta a solution for dealing with the demands they create, too. Its Alchemy SAS (Semantic Analysis System), a name that is subject to change, by the way, processes content with the AlchemyAPI’s functionality, then stores and organizes the resulting content analysis and metadata in a cloud data store that customers can query to discover patterns.
“We process and store and give them the tools to find trends to go with that data,” says project engineer Shaun Roach. “That deals with everything the AlchemyAPI produces and storing that massive data and being able to quickly query that is a big technical challenge.” Consider this in the context of sentiment analysis, where users might want an aggregate view, across hundreds of millions of pieces of data, of how an issue or topic has trended in social media, queried hour by hour over the last 24 hours. An etailer, for instance, can run a query that shows sentiment on the RIM Blackberry for the past week, “and if it’s trending positive maybe promote the Blackberry to the first page on their web site as a special offer, and do it all programmatically. It’s how to organize everything into a data store to query and look for patterns that is the main challenge.”
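The hourly aggregate view Roach describes can be illustrated with a minimal Python sketch. It is not Alchemy SAS's query interface, which the article does not describe. The function names, the `(timestamp, score)` record shape, the assumed score range of -1.0 to 1.0, and the "trending positive" rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_sentiment(records, now, hours=24):
    """Average sentiment per hour bucket over the last `hours` hours.

    `records` is an iterable of (timestamp, score) pairs; the score is
    assumed (for this sketch) to lie between -1.0 and 1.0.
    Returns a dict mapping each hour's start time to its mean score.
    """
    cutoff = now - timedelta(hours=hours)
    totals = defaultdict(lambda: [0.0, 0])  # hour start -> [sum, count]
    for ts, score in records:
        if cutoff <= ts <= now:
            bucket = ts.replace(minute=0, second=0, microsecond=0)
            totals[bucket][0] += score
            totals[bucket][1] += 1
    return {hour: s / n for hour, (s, n) in sorted(totals.items())}


def is_trending_positive(hourly, threshold=0.0):
    """Hypothetical rule: the later half of the window averages higher
    than the earlier half and sits above `threshold`."""
    values = list(hourly.values())
    if len(values) < 2:
        return False
    mid = len(values) // 2
    early = sum(values[:mid]) / mid
    late = sum(values[mid:]) / (len(values) - mid)
    return late > early and late > threshold
```

A retailer's script could call `hourly_sentiment` on scored mentions of a product and promote the product only when `is_trending_positive` returns `True`. At the volumes Roach mentions, the grouping would be done inside the data store rather than in application memory.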
The sentiment analysis functions are more directed to the social media monitoring and advertising set among AlchemyAPI’s customers than its publishing clients (Roach says three of the six major social monitoring services already use its technology in their back-ends). And how does the vendor see its solution as differentiated from the growing number of providers in the market? Roach credits its established NLP service and a scalable cloud infrastructure, which currently supports some 20 million API calls a day, as a starting point for clients who are making their way through the morass of some 150 million tweets a day and a half-million to one million blog posts. “Those are large quantities of information, so you need a service robust enough to handle that. Ours is,” he says. “We’ve handled this type of volume in the past and we have the infrastructure to handle someone who wants to send a firehose of data through our service and get sentiment on the other side.”
The other differentiator he points to is targeted sentiment: delivering accuracy on more than a single sentiment value, especially in news blogs and articles where there is complex interaction between the subjects of sentiment and the sentiments themselves. The AlchemyAPI adds sentiment analysis to its keyword, document concept and entity-level functions. So, for an article that references Steve Jobs, Apple and Microsoft, the service would be able to determine whether the text is positive or negative on each of those, Roach says, as well as the perspective of the article at large.
He also points to the help machine learning gives in terms of determining whether a word is positive or negative in context – a video game that’s sick is good, a person who’s sick isn’t. But there are still plenty of nuts to crack in sentiment analysis, he acknowledges, like dealing with negations in tweets (e.g. someone retweets a post that’s negative about something but the retweet is actually to disagree with that point of view). “That challenge is something we’re getting better on. We can look for negations to flip sentiment but the term isn’t always exactly clear,” he says.
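The idea of looking for negations to flip sentiment can be shown with a deliberately simple, rule-based Python sketch. It is not AlchemyAPI's method, which Roach describes as machine learning-based. The word lists, the three-token negation window and the `score` function are hypothetical and exist only to illustrate the technique.

```python
# Tiny illustrative word lists; a real system would use far richer resources.
NEGATIONS = {"not", "no", "never", "isn't", "don't", "doesn't", "can't", "won't"}
LEXICON = {
    "good": 1.0, "great": 1.0, "love": 1.0,
    "bad": -1.0, "awful": -1.0, "hate": -1.0,
}


def score(text, window=3):
    """Sum word polarities, flipping any polarity word that appears
    within `window` tokens after a negation word."""
    cleaned = text.lower().replace("’", "'")
    for ch in ",.!?:;":
        cleaned = cleaned.replace(ch, " ")
    total = 0.0
    negate_left = 0  # tokens remaining in the current negation window
    for tok in cleaned.split():
        if tok in NEGATIONS:
            negate_left = window
            continue
        polarity = LEXICON.get(tok, 0.0)
        if polarity and negate_left > 0:
            polarity = -polarity  # the negation flips the sentiment
        total += polarity
        if negate_left > 0:
            negate_left -= 1
    return total
```

Under these rules, "This phone is great" scores positive and "This phone is not great" scores negative. A retweet such as "So wrong: this phone is awful" still scores negative, because the disagreement is not expressed with a negation word the rule recognizes, which is the kind of case Roach says remains hard.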
There are commercial, for-fee options for access to the AlchemyAPI, but it also has a free option that allows up to 1,000 API calls per day, with increased volumes available to researchers and not-for-profits.