Chloe Green of Information Age recently wrote, “Handling immense data sets requires a combination of scientific and technological skills to determine how data is stored, searched and accessed. In science, the importance of data scientists in ensuring that data is handled correctly from the outset is not underestimated; other industries can learn from the scientific approach. Text-mining tools and the use of relevant taxonomies are essential. If we think about big data as a huge number of data points in some multi-dimensional space, the problem is one of analysis, i.e. frequently finding very similar or very dissimilar points which cannot be compared. In life sciences, taxonomies assign data points a class, thus comparison of two points is as easy as looking up other data points in the same class.” Read more
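Green's point about taxonomies can be made concrete: once a taxonomy assigns each data point a class, "find similar points" becomes a bucket lookup rather than a pairwise comparison across the whole space. A minimal sketch, with invented data and function names (nothing here comes from the quoted article):

```python
# Sketch of class-based lookup: points are indexed by their taxonomy class,
# so similarity search reduces to retrieving the rest of the bucket.
from collections import defaultdict

def index_by_class(points):
    """points: iterable of (point_id, taxonomy_class) pairs."""
    index = defaultdict(list)
    for point_id, cls in points:
        index[cls].append(point_id)
    return index

def similar_points(index, point_id, cls):
    """Every other point sharing the same taxonomy class."""
    return [p for p in index.get(cls, []) if p != point_id]
```

The trade-off is that similarity is only as good as the class assignments, which is why the article stresses getting data handling right from the outset.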
Posts Tagged ‘Text Mining’
Comprehensive support for semantic analysis across 20 languages (up from 10) is one of the latest additions to TextRazor’s customizable, open semantic analysis and text mining API. The addition responds to what the startup says is growing demand for sophisticated semantic tools that go beyond English.
The company’s technology has been in public beta for just a few months. It differs from other multilingual natural language processing solutions, says founder Toby Crayston, in that it strongly leverages linked data sources like DBpedia and the semantic web to disambiguate, normalize and filter extracted metadata with better accuracy, so that end users can build powerful multilingual classifiers regardless of the language of their documents.
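The disambiguation idea Crayston describes, using linked data labels to decide which resource a mention refers to, can be sketched roughly as follows. This is not TextRazor's actual API or algorithm; the resource identifiers, label table, and scoring heuristic are all illustrative assumptions:

```python
# Illustrative sketch of linked-data disambiguation: a mention is matched
# against labels harvested from a source like DBpedia, and ambiguity is
# resolved by a (deliberately naive) context-overlap score.
LINKED_DATA_LABELS = {
    "dbpedia:Apple_Inc": {"apple", "apple inc", "apple computer"},
    "dbpedia:Apple": {"apple", "apples"},
    "dbpedia:Paris": {"paris", "city of light"},
}

def candidate_resources(mention):
    """Return every resource whose label set contains the mention."""
    key = mention.lower().strip()
    return sorted(r for r, labels in LINKED_DATA_LABELS.items() if key in labels)

def disambiguate(mention, context_words):
    """Pick the candidate whose identifier shares the most words with the
    surrounding context. Real systems use far richer signals (graph links,
    popularity, statistical models)."""
    candidates = candidate_resources(mention)
    if not candidates:
        return None
    context = {w.lower() for w in context_words}
    def overlap(resource):
        name_words = set(resource.split(":")[1].lower().replace("_", " ").split())
        return len(name_words & context)
    return max(candidates, key=overlap)
```

The point of routing everything through linked data identifiers, as the company argues, is that the same canonical resource is produced no matter which language the source document is in.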
In the United States, the app economy, as of late 2012, had created close to 530,000 jobs and served as a significant economic driver for a number of states. A study released by CTIA-The Wireless Association and the Application Developers Alliance, dubbed The Geography of the App Economy, counted more than 2.4 million apps available across more than 11 operating systems and projected that mobile app revenue would exceed $46 billion by 2016.
Europe wants in. No wonder, given figures like ABI Research’s projection this year that combined app revenue from tablets and smartphones will reach $92 billion by 2018, with the app economy growing at an average of 44.6 percent annually. But the continent needs some data to help it get its spot in the limelight, which is where Eurapp comes in.
The newly launched venture, Eurapp, was birthed by the European Commission, and is being run by the Digital Enterprise Research Institute at NUI Galway in conjunction with tech industry analyst firm GigaOM Research. It’s part of the Startup Europe initiative of the European Commission’s Digital Agenda, which aims to help tech entrepreneurs start, maintain and grow their businesses in Europe. NUI Galway’s Dr John Breslin, SIOC creator and co-founder of iPad news and social reader app StreamGlider (see our story here) is leading the Eurapp project at DERI.
Appinions, an opinion-based marketing platform, reports that the company has been awarded a patent “for their groundbreaking methodology for analyzing, extracting and summarizing opinions from digital text, such as articles, blog posts and social media messages. The US Patent Office awarded the patent to Appinions for the system and method for automatically summarizing fine-grained opinions in digital text. The patent covers the methodology Appinions uses to analyze text from more than 5 million online sources to extract opinions, identify the person or source of a particular opinion and determine the topic and sentiment of that opinion. The patent is effective for 20 years and precludes other companies from implementing the methodology and business processes developed by Appinions.” Read more
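The patent describes extracting fine-grained opinion records: who holds the opinion, what it is about, and its sentiment. A toy sketch of that output shape, using a single invented regex pattern that stands in for (and is nothing like) Appinions' actual methodology:

```python
# Toy illustration of a fine-grained opinion record: (holder, topic, sentiment).
# The one-pattern extractor below is a deliberately simplistic stand-in.
import re

PATTERN = re.compile(r"(?P<holder>[A-Z][\w. ]+?) (?:says|believes) (?:that )?"
                     r"(?P<topic>[\w ]+?) (?:is|are) (?P<polarity>\w+)")

POSITIVE_WORDS = {"promising", "excellent", "strong"}

def extract_opinion(sentence):
    m = PATTERN.search(sentence)
    if not m:
        return None
    sentiment = "positive" if m.group("polarity") in POSITIVE_WORDS else "negative"
    return {"holder": m.group("holder").strip(),
            "topic": m.group("topic").strip(),
            "sentiment": sentiment}
```

Attributing each opinion to a named holder is what distinguishes this kind of record from document-level sentiment scoring.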
If you would like your company to be considered for an interview please email editor[ at ]semanticweb[ dot ]com.
In this segment of our “Innovation Spotlight” we spoke with Elliot Turner (@eturner303), the founder and CEO of AlchemyAPI.com. AlchemyAPI’s cloud-based platform processes around 2.5 billion requests per month. Elliot describes how their API helps companies with sentiment analysis, entity extraction, linked data, text mining, and keyword extraction.
Sean: Hi Elliot, thanks for joining us, how did AlchemyAPI get started?
Elliot: AlchemyAPI was founded in 2005 and in the past seven years has become one of the most widely used semantic analysis APIs, processing billions of transactions monthly for customers across dozens of countries.
I am the Founder and CEO and a serial entrepreneur who comes from the information security space. My previous company built and sold high-speed network security appliances. After it was acquired, I started AlchemyAPI to focus on the problem of understanding natural human language and written communications.
Sean: Can you describe how your API works? What does it allow your customers to accomplish?
Elliot: Customers submit content via a cloud-based API, and AlchemyAPI analyzes that information in real-time, transforming opaque blobs of text into structured data that can be used to drive a number of business functions. The service is capable of processing thousands of customer transactions every second, enabling our customers to perform large-scale text analysis and content analytics without significant capital investment.
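The "opaque blob in, structured data out" transformation Turner describes can be sketched in miniature. The heuristics below are naive local stand-ins invented for illustration, not AlchemyAPI's methods, and the output fields are assumptions about what such a record might contain:

```python
# Sketch of a text-analysis transformation: unstructured text in,
# a structured record (entities, sentiment, counts) out.
import re

POSITIVE_WORDS = {"great", "excellent", "good", "love"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "hate"}

def analyze(text):
    words = re.findall(r"[A-Za-z']+", text)
    # Naive entity heuristic: capitalized tokens after the first word.
    entities = sorted({w for i, w in enumerate(words) if w[0].isupper() and i > 0})
    pos = sum(w.lower() in POSITIVE_WORDS for w in words)
    neg = sum(w.lower() in NEGATIVE_WORDS for w in words)
    sentiment = "positive" if pos > neg else "negative" if neg > pos else "neutral"
    return {"entities": entities, "sentiment": sentiment, "word_count": len(words)}
```

In the real service this analysis happens server-side behind the API, which is what lets customers run it at scale without their own infrastructure.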
Wikimeta Project’s Evolution Includes Commercial Ambitions and Focus On Text-Mining, Semantic Annotation Robustness
Wikimeta, the semantic tagging and annotation architecture for incorporating semantic knowledge within documents, websites, content management systems, blogs and applications, is incorporating itself this month as a company called Wikimeta Technologies. Wikimeta, which traces its heritage to the NLGbAse project, was launched as a standalone web service last year.
Eric Charton, Ph.D., MSc, of École Polytechnique de Montréal, is project leader and author of the Wikimeta code. The NLGbAse project was conducted by Charton at the University of Avignon as part of his Ph.D. thesis. The Semantic Web Blog recently hosted an email discussion with him to learn more about the Wikimeta architecture and its evolution.
The Semantic Web Blog: Tell us about the NLGbAse project and Wikimeta’s relationship to it.
Charton: NLGbAse is an ontology extracted from Wikipedia. It is used in Wikimeta as a resource for semantic disambiguation. For each Wikipedia document (i.e., semantic concept), NLGbAse provides the various surface forms used for detection (for example, “General Motors” can be written “GM Company”, “GM”, “General Motors Corp” and so on).
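The lookup Charton describes can be sketched as a reverse index from surface form to concept. The data and function names below are illustrative, not taken from NLGbAse itself:

```python
# Minimal sketch of surface-form detection: each Wikipedia-derived concept
# lists the forms it is known by; inverting the table once makes detection
# a dictionary lookup.
CONCEPT_SURFACE_FORMS = {
    "General Motors": ["General Motors", "GM Company", "GM", "General Motors Corp"],
    "Genetically modified organism": ["GMO", "genetically modified organism"],
}

SURFACE_TO_CONCEPTS = {}
for concept, forms in CONCEPT_SURFACE_FORMS.items():
    for form in forms:
        SURFACE_TO_CONCEPTS.setdefault(form.lower(), []).append(concept)

def detect(surface_form):
    """Return every concept a surface form may refer to (ambiguity preserved)."""
    return SURFACE_TO_CONCEPTS.get(surface_form.lower(), [])
```

When a surface form maps to more than one concept, a disambiguation step like the one NLGbAse supports in Wikimeta is needed to pick among the candidates.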
Selventa and Linguamatics have announced a strategic partnership to provide a “complete pipeline of scientific knowledge extraction to their life science partners.” According to the article, “The alliance will bring together established analytical capabilities of both companies to efficiently extract complex life science knowledge in a computable, structured, biological expression language (BEL) format that can be used to interpret large-scale experimental data in the context of published literature.” Read more
Since Nstein was acquired by OpenText a little over a year ago, work has been underway to build the former’s semantic technology for text mining, analytics and search into the latter’s enterprise content management platform. So far, that’s resulted in adding Semantic Navigation, the on-premises or cloud website search and content discovery solution, to OpenText’s Web content management (WCM) products, such as OpenText Web Experience Management and Web Site Management.
This covers aspects such as content tagging and semantic faceting at the content and document levels. This year and next should see further integration of Nstein technologies into the OpenText solutions set, as well as some new offerings emerging to support other use cases.
As an example, the company is working on a listening platform application, drawing on work Nstein had done for the Canadian government’s public health agency that used its Text Mining Engine to identify potential threats to human health by scouring multiple sources (including news aggregators like Factiva) that were parsed for about 1,000 concepts such as “mysterious ailments” and “outbreak.” It’s building up a framework for ingesting different data sources to support this, says Charles-Olivier Simard, product manager for semantic technologies at OpenText.
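The core of such a listening application, scanning ingested documents against a concept watch list, can be sketched briefly. This is not OpenText's or Nstein's implementation; the three-entry list stands in for the roughly 1,000 concepts the article mentions:

```python
# Sketch of concept-based listening: flag any ingested document that
# mentions a watched concept, in the spirit of the health-threat use case.
WATCHED_CONCEPTS = ["mysterious ailment", "outbreak", "quarantine"]

def flag_documents(documents):
    """documents: iterable of (doc_id, text).
    Returns (doc_id, matched concepts) for every document with a hit."""
    flagged = []
    for doc_id, text in documents:
        lowered = text.lower()
        hits = [c for c in WATCHED_CONCEPTS if c in lowered]
        if hits:
            flagged.append((doc_id, hits))
    return flagged
```

A production system would of course parse for concepts rather than literal strings, which is exactly what a text mining engine adds over substring matching.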
A recent article highlights Syllabs, “a web service built to apply semantic analysis to website text in a few extremely useful ways. The Syllabs API can detect the language of text, detect which ‘named entities’ a block of text mentions (people, businesses) and find related keywords to any keyword. When you hear about the semantic web, this is the sort of API driving that future.” Read more
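One of the capabilities listed, language detection, can be illustrated with a toy stopword-overlap heuristic. Real services like Syllabs use statistical models; the word lists and scoring here are invented for the sketch:

```python
# Toy language detection by stopword overlap: the language whose common
# words appear most often in the text wins.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to"},
    "fr": {"le", "la", "et", "est", "de"},
    "es": {"el", "la", "y", "es", "de"},
}

def detect_language(text):
    words = set(text.lower().split())
    scores = {lang: len(words & stops) for lang, stops in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Named-entity detection and related-keyword lookup, the other capabilities mentioned, follow the same pattern of exposing a text-in, structured-data-out endpoint.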