Analytics

Alta Plana Takes The Pulse Of Text Analytics

Seth Grimes, president and principal consultant of Alta Plana Corp. and founding chair of the Sentiment Analysis Symposium, has put together a thorough new report, Text Analytics 2014: User Perspectives on Solutions and Providers. Among the interesting findings of the report is that “growth in text analytics, as a vendor market category, has slackened, even while adoption of text analytics, as a technique, has continued to expand rapidly.”

Grimes explains that in a fragmented market, consisting of everything from text analytics services to solution-embedded technologies, the opportunities for users to practice text analytics are strong, but text analytics is increasingly not the main focus of the solutions being used.

Reflecting the diversity of options, respondents listed among their providers a number of open-source offerings such as Apache OpenNLP and GATE, API services such as AlchemyAPI and Semantria, and enterprise software solution and business suite providers like SAP. The word cloud above was generated by Alta Plana at Wordle.net to show how users responded when asked which companies they know to provide text/content analytics functionality. Nearly 50 percent of users are likely to recommend their most important provider.

Read more

OpenText Takes Next Steps In Automatic Content Classification

OpenText yesterday made its secure file sharing and synchronization product, Tempo Box, available for free to customers using its OpenText Content Suite enterprise information management tool.

“A lot of our customers have major concerns about employees sharing documents with cloud tools like Dropbox,” says Lubor Ptacek, vp of strategic marketing. Employees want documents to be available, synced and shareable across all their devices, but using such services can create security and compliance problems. By deploying Tempo Box on top of their existing infrastructure, at no charge to all internal employees and any external parties they may need to share content with, companies get a seamless and cost-effective way to share files in the cloud without compromising security, records management requirements and storage optimization, he says – “the things that enterprise customers care about, especially those operating in regulated environments.”

Among those capabilities is applying automatic content classification, which is usually required for records management reasons – for example, helping companies determine if a document is an employee record they must keep for five years or a tax record they have to hold for seven years. That under-the-hood classification engine is an outgrowth of OpenText’s acquisition a few years back of text mining, analytics and search company Nstein. Since the acquisition, says Ptacek, the company has been looking at ways to apply the technology to specific business problems and make it part of its applications.
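To illustrate why the classification step matters for records management, here is a minimal Python sketch of what can happen once a document has a category. It is not OpenText's implementation: the category names, the `RETENTION_YEARS` table and the `retention_deadline` helper are hypothetical, and the retention periods are simply the two examples mentioned above.

```python
from datetime import date

# Hypothetical retention schedule in years, using only the article's two examples.
RETENTION_YEARS = {
    "employee_record": 5,
    "tax_record": 7,
}


def retention_deadline(category: str, created: date) -> date:
    """Return the date until which a document of this category must be kept."""
    years = RETENTION_YEARS[category]  # raises KeyError for an unknown category
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # A Feb 29 creation date falls back to Feb 28 when the target year is not a leap year.
        return created.replace(year=created.year + years, day=28)


if __name__ == "__main__":
    print(retention_deadline("tax_record", date(2014, 7, 1)))
```

The sketch assumes the category has already been assigned; in practice, producing that label automatically is the hard part that a classification engine handles.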

Read more

GraphLab Create Aims To Be The Complete Package For Data Scientists

Data scientists can add another tool to their toolset today: GraphLab has launched GraphLab Create 1.0, which bundles everything from tools for data cleaning and engineering through to state-of-the-art machine learning and predictive analytics capabilities.

Think of it, company execs say, as the single platform that data scientists or engineers can leverage to unleash their creativity in building new data products, enabling them to write code at scale on their own laptops. The driving concept behind the solution, they say, is to make large-scale machine learning and predictive analytics easy enough that companies won’t have to hire huge teams of data scientists and engineers and build the big hardware infrastructures that lie behind many of today’s Big Data-intensive products. And the data scientists and engineers who do use it won’t need to be experts at machine-learning algorithms – just experienced enough to write Python code.

Read more

Versium Leverages Microsoft Azure Machine Learning For New Predictive GivingScore Solution To Improve Fundraising

Versium, which earlier this year launched its Predictive FraudScore solution (covered here), today releases its Predictive GivingScore solution, designed to help charitable institutions and political organizations better predict who is likely to donate, be a repeat donor, or make a more significant contribution. Predictive GivingScore is the latest of the company’s predictive Score products, which also include churn, social influencer and shopper scoring – and it’s by no means the last.

It was built with Microsoft Azure Machine Learning, a managed cloud service for building predictive analytics solutions that was publicly unveiled just a short time ago. CEO Chris Matty says that platform is an aid to Versium in rapidly building its new score solutions. (Just shy of ten Versium scoring products are currently in use or in development.) Azure ML, Matty notes, contains dozens of machine learning algorithms and mathematical computation models that Versium leverages to easily and effectively experiment with, create and tune models to get the highest accuracy in predictive scoring solutions.

“Once we have a score built it just takes little tuning. But when we are building a new score we need to look at some different models and see what works better,” he says. “We want to move quickly by evaluating the different models, and we can visualize very easily the process of building the predictive model.”

Read more

Daedalus Takes Meaning-As-A-Service To Excel, GATE And CMS Systems

Daedalus (which The Semantic Web Blog originally covered here) has just made its Textalytics meaning-as-a-service APIs available for Excel and GATE (General Architecture for Text Engineering), a Java suite of tools used for natural language processing tasks, including information extraction in many languages. Connecting its semantic analysis tools with these systems is one step in a larger plan to extend its integration capabilities with more API plug-ins.

“For us, integration options are a way to lower barriers to adoption and to foster the development of an ecosystem around Textalytics,” says Antonio Matarranz, who leads marketing and sales for Daedalus. The three main ecosystem scenarios, he says, include personal productivity tools, of which the Excel add-in is an example, and NLP environments, of which GATE is an example. “But UIMA (Unstructured Information Management Architecture) is also a target,” he says. The list also is slated to include content management systems and search engines, among them open source systems like WordPress, Drupal, and Elasticsearch.

Read more

Extracting Value from Big Data Requires Machine Learning

James Kobielus of InfoWorld recently wrote, “Machine-generated log data is the dark matter of the big data cosmos. It is generated at every layer, node, and component within distributed information technology ecosystems, including smartphones and Internet-of-things endpoints… Clearly, automation is key to finding insights within log data, especially as it all scales into big data territory. Automation can ensure that data collection, analytical processing, and rule- and event-driven responses to what the data reveals are executed as rapidly as the data flows. Key enablers for scalable log-analysis automation include machine-data integration middleware, business rules management systems, semantic analysis, stream computing platforms, and machine-learning algorithms.” Read more

The Data Behind the Internet of Things

Nancy Gohring of Computerworld recently wrote, “The market for connected devices like fitness wearables, smart watches and smart glasses, not to mention remote sensing devices that track the health of equipment, is expected to soar in the coming years. By 2020, Gartner expects, 26 billion units will make up the Internet of Things, and that excludes PCs, tablets and smartphones. With so many sensors collecting data about equipment status, environmental conditions and human activities, companies are growing rich with information. The question becomes: What to do with it all? How to process it most effectively and use it in the smartest way possible?” Read more

Where The Money Is: Data Science

What’s a data scientist worth? Give me a second to identify the relevant data sources, build the machine learning algorithm and create a visualization.

So much for a new take on an old joke. The real answer is about six figures, information I recently came across in a report released earlier this spring: Burtch Works Executive Recruiting survey, Salaries of Data Scientists. The median base salary of data scientist managers is $160,000, it says, while individual contributors average about $120,000. The information comes from 171 data scientists for whom the recruiting firm has complete and current information. Whether a data scientist is at a lower or higher job level, across the board he or she is doing financially better than other Big Data professionals, the report shows.

Read more

Sentiment Mining for Real Time Insights on Twitter

 

Kalev Leetaru of Wired recently wrote, “For its flagship new reality show Opposite Worlds the Syfy channel wanted to let the audience ‘remote control’ the show via social media. I worked with Syfy to create what ultimately became its real-time ‘Twitter Popularity Index.’ The Index combines the intensity of conversation around each character, the number of unique discussants, and the emotion of that discussion using a new sentiment engine powered by over 1.6 million words, phrases and common misspellings and colloquial expressions. Using our Index, Opposite Worlds [set] records across the board in Twitter engagement for a cable television series.” Read more
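The excerpt names three inputs to the Index (conversation intensity, unique discussants, and emotion) but not how they are combined. The Python sketch below shows one possible combination, purely as an illustration: the `CharacterStats` fields, the normalization against the busiest character, and the equal weights are all assumptions, not the formula Leetaru and Syfy actually used.

```python
from dataclasses import dataclass


@dataclass
class CharacterStats:
    mentions: int          # tweets mentioning the character (intensity)
    unique_authors: int    # distinct accounts discussing the character
    mean_sentiment: float  # average sentiment score in [-1.0, 1.0]


def popularity_index(stats, max_mentions, max_authors,
                     weights=(1 / 3, 1 / 3, 1 / 3)):
    """Combine the three signals into a score in [0, 1] (hypothetical weighting)."""
    # Normalize counts against the most-discussed character in the same window.
    intensity = stats.mentions / max_mentions if max_mentions else 0.0
    reach = stats.unique_authors / max_authors if max_authors else 0.0
    # Rescale sentiment from [-1, 1] to [0, 1] so all three terms share a range.
    emotion = (stats.mean_sentiment + 1.0) / 2.0
    w_intensity, w_reach, w_emotion = weights
    return w_intensity * intensity + w_reach * reach + w_emotion * emotion


if __name__ == "__main__":
    leader = CharacterStats(mentions=100, unique_authors=50, mean_sentiment=0.0)
    print(round(popularity_index(leader, max_mentions=100, max_authors=50), 3))
```

Keeping the three terms on a common 0-to-1 scale makes the weights easy to interpret; a real-time system would recompute the maxima over a sliding window of recent tweets.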

Big Data Startup Infinite Analytics Maps Your Social Genome

Deepti Chaudhary of Forbes India recently wrote, “Founded in December 2012, Infinite Analytics is a cloud-based big data company that predicts consumer behaviour based on information shared by users on their social networking sites… Infinite Analytics analyses raw data, maps out a person’s social genome and then gives personalised recommendations to consumer brands that have an online presence. This information, which is collected without breaking privacy laws, allows a retailer to identify and recommend products that will appeal to a customer.” Read more
