In a recent interview, Matthew Russell, co-founder and Principal of Zaffra, discusses the limitations and possible applications of sentiment analysis. Russell states, “Think of sentiment analysis as ‘opinion mining,’ where the objective is to classify an opinion according to a polar spectrum. The extremes on the spectrum usually correspond to positive or negative feelings about something, such as a product, brand, or person.”

When asked about the limitations of sentiment analysis, Russell said, “Like all opinions, sentiment is inherently subjective from person to person, and can even be outright irrational. It’s critical to mine a large — and relevant — sample of data when attempting to measure sentiment. No particular data point is necessarily relevant. It’s the aggregate that matters. An individual’s sentiment toward a brand or product may be influenced by one or more indirect causes; someone might have a bad day and tweet a negative remark about something they otherwise had a pretty neutral opinion about. With a large enough sample, outliers are diluted in the aggregate. Also, since sentiment very likely changes over time according to a person’s mood, world events, and so forth, it’s usually important to look at data from the standpoint of time.”
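
Russell's two points here, that the aggregate matters more than any single data point and that sentiment should be viewed over time, can be illustrated with a minimal Python sketch. It is not taken from the interview: the function name `aggregate_by_period`, the score scale of -1.0 (negative) to 1.0 (positive), and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def aggregate_by_period(observations, period_key=lambda d: (d.year, d.month)):
    """Group (date, score) pairs by period; return {period: (mean_score, count)}.

    Hypothetical helper: scores are assumed to lie in [-1.0, 1.0].
    """
    buckets = defaultdict(list)
    for day, score in observations:
        buckets[period_key(day)].append(score)
    return {
        period: (mean(scores), len(scores))
        for period, scores in sorted(buckets.items())
    }


# Made-up observations, for illustration only.
observations = [
    (date(2012, 1, 3), 0.0),
    (date(2012, 1, 9), 0.0),
    (date(2012, 1, 15), 0.0),
    (date(2012, 1, 21), -1.0),  # a single "bad day" remark
    (date(2012, 2, 2), 0.5),
    (date(2012, 2, 14), 0.5),
]

monthly = aggregate_by_period(observations)
for period, (avg, count) in monthly.items():
    print(period, round(avg, 2), count)
```

In this toy data the one strongly negative remark pulls the January average down only to -0.25 across four observations, and the effect would shrink further as the sample grows. Reporting the count next to each average also makes it easier to notice periods with too few data points to trust.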

Russell continued, “As to sarcasm, like any other type of natural language processing (NLP) analysis, context matters. Analyzing natural language data is, in my opinion, the problem of the next 2-3 decades. It’s an incredibly difficult issue, and sarcasm and other types of ironic language are inherently problematic for machines to detect when looked at in isolation. It’s imperative to have a sufficiently sophisticated and rigorous enough approach that relevant context can be taken into account. For example, that would require knowing that a particular user is generally sarcastic, ironic, or hyperbolic, or having a larger sample of the natural language data that provides clues to determine whether or not a phrase is ironic.”

Image: Courtesy Flickr/GRwitters