Barry Levine of CMS Wire reports, “To help businesses find useful insights in growing amounts of big data, Massachusetts-based OneSource is reinventing search – and changing its name to Avention to reflect its new direction. Jonathan Flatow, Avention’s CEO, told us the new name implies ‘avenue of invention’ — something he believes suits the new search application. Designed for sales, marketing and business researchers, it uses natural language and semantic understanding to conceptually sift through mounds of data sources. Phil McWade, Avention’s Manager of Product Development, told CMSWire, ‘We’re giving our customers what they really want.’ Marketing and sales professionals don’t want ‘a list of news articles’ about companies: They want to identify companies they can sell to.” Read more
Will deep learning take us where we want to go? It’s one of the questions that Oxford University professor of Computational Linguistics Stephen Pulman will be delving into at this week’s Sentiment Analysis Symposium. There, he’ll be participating in a workshop session today on compositional sentiment analysis and giving a presentation tomorrow on bleeding-edge natural language processing.
“There is a lot of hype about deep learning, but it’s not a magic solution,” says Pulman. “I worry whenever there is hype about some technologies like this that it raises expectations to the point where people are bound to be disappointed.”
That’s not to imply, however, that important progress isn’t being made in deep learning, which applies machine learning methods based on learning data representations, with applications spanning everything from NLP to computer vision to speech recognition.
Michael C. Daconta of GCN recently wrote, “Recent articles about Pandora’s and Netflix’s use of big data illustrate why government IT managers should not just focus on data management, data collection and even big data processing. They need to shift the focus from the data producer to the data consumer… In both these cases, we see big data is the stepping stone for consumer-centric information production. The Netflix micro-genres are not the trove of big data on movie viewing, or the movie data itself. Instead, it is useful information mined from that data. Likewise, the data containing Pandora users’ demographics and preferences create a way for advertisers to target buyers.” Read more
Research this month from MindMetre Research shows that 89 percent of organizations believe they need to gain greater insight into their growing volumes of unstructured data to improve their commercial advantages and gain a competitive edge. That insight into such data, the research reports, could feed a number of business-boosting scenarios. “This content can be used to provide insights for proposals and projects, to inform business relationships, to enable collaboration, to avoid repetition of research, to repurpose content, and generally to streamline the flow of enterprise knowledge and avoid replication of work already done,” says Paul Lindsell, Managing Director of MindMetre.
Jennifer Zaino recently wrote an article for our sister website DATAVERSITY on the evolving field of NoSQL databases. Zaino wrote, “Hadoop HBase. MongoDB. Cassandra. Couchbase. Neo4J. Riak. Those are just a few of the sprawling community of NoSQL databases, a category that originally sprang up in response to the internal needs of companies such as Google, Amazon, Facebook, LinkedIn, Yahoo and more – needs for better scalability, lower latency, greater flexibility, and a better price/performance ratio in an age of Big Data and Cloud computing. They come in many forms, from key-value stores to wide-column stores to data grids and document, graph, and object databases. And as a group – however still informally defined – NoSQL (considered by most to mean ‘not only SQL’) is growing fast. The worldwide NoSQL market is expected to reach $3.4 billion by 2018, growing at a CAGR of 21 percent between last year and 2018, according to Market Research Media.” Read more
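The difference between the data models Zaino lists can be sketched in a few lines. The toy below uses plain Python dicts as stand-ins for real stores (no actual database API is involved) to contrast the two most common forms: a key-value store, where the value is an opaque blob, and a document store, where fields inside the value can be queried.

```python
# Toy illustration of two NoSQL data models; plain dicts stand in
# for real stores -- no actual database is involved.

# Key-value store: the value is opaque; lookup is by key only.
kv_store = {}
kv_store["user:42"] = '{"name": "Ada", "city": "London"}'  # value is just a blob

# Document store: the value is a structured document whose fields
# can be inspected and filtered individually.
doc_store = {
    "user:42": {"name": "Ada", "city": "London", "tags": ["admin"]},
    "user:43": {"name": "Grace", "city": "Arlington", "tags": []},
}

# A key-value store can only fetch whole values by key...
blob = kv_store["user:42"]

# ...while a document store can filter on fields inside the value.
londoners = [doc["name"] for doc in doc_store.values() if doc["city"] == "London"]
print(londoners)  # ['Ada']
```

Wide-column, graph, and object databases extend this spectrum further, trading query flexibility against scalability in different ways.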
Jeff Bertolucci of Information Week reports, “Computers do many things faster and more efficiently than the human brain, but they’re decidedly inferior when it comes to extracting meaning from human language. As BigData-Startups.com founder Mark van Rijmenam writes in a recent blog post, the key stumbling block here is that computers understand ‘unambiguous and highly structured’ programming language, while human language is a minefield of nuance, emotion, and implied intent. Van Rijmenam also quotes a Chronicle of Higher Education post by Geoffrey Pullum, a professor of general linguistics at the University of Edinburgh. Pullum outlines three prerequisites for computers to master human language: ‘First, enough syntax to uniquely identify the sentence; second, enough semantics to extract its literal meaning; and third, enough pragmatics to infer the intent behind the utterance, and thus discerning what should be done or assumed given that it was uttered.’ ” Read more
Connect The Dots: Embarcadero Technologies’ Update Integrates Metadata Governance Repository Knowledge With Its Database Tools
Embarcadero Technologies has an update of its database tools – ER/Studio, DBArtisan, Rapid SQL, DB Optimizer and DB Change Manager XE5 – that among its new features includes integration with its Connect metadata governance repository. Connect, which The Semantic Web Blog covered here, keeps all the information about an enterprise’s data – what it means and where it is – to bridge the gap between the work of governance teams and that of day-to-day operations.
“We are providing terrific metadata integration right in the product,” says Henry Olson, director of product management. It is effectively the first instance of collaboration, syndication and integration across ER/Studio, Embarcadero’s data architecture and modeling tools, DB PowerStudio database development, administration, and performance tuning solutions, and Connect. “That’s a deep theme for us because it is a perennial problem in large organizations to make the work of the data architect team more broadly available,” he says, “and to make others more aware of the data assets and better able to use them.”
Elasticsearch 1.0 launches today, combining Elasticsearch real-time search and analytics, Logstash (which helps you take logs and other event data from your systems and store them in a central place), and Kibana (for graphing and analyzing logs) in an end-to-end stack designed to be a complete platform for data interaction. This first major release of the solution, which delivers actionable real-time insights from almost any type of structured or unstructured data source, follows on the heels of the commercial monitoring solution Elasticsearch Marvel, which gives users insight into the health of Elasticsearch clusters.
Organizations from Wikimedia to Netflix to Facebook today take advantage of Elasticsearch, which VP of Engineering Kevin Kluge says has been distinguished, since its open-source start four years ago, by its focus on real-time search in a distributed fashion. The native JSON and RESTful search tool “has intelligence where when it gets a new field that it hasn’t seen before, it discerns from the content of the field what type of data it is,” he explains. Users can optionally define schemas if they want, or stay more freeform and quickly add new styles of data while still profiting from easier management and administration, he says.
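The behavior Kluge describes is Elasticsearch’s dynamic mapping: when a document arrives with a previously unseen field, the engine guesses an index type from the value. The sketch below is a toy illustration of that idea in plain Python, not Elasticsearch’s actual code; the type names loosely follow its mapping vocabulary.

```python
import datetime

def infer_field_type(value):
    """Guess an index mapping type from a JSON field's Python value
    (a toy stand-in for Elasticsearch's dynamic mapping)."""
    if isinstance(value, bool):   # check bool before int: bool subclasses int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        # Strings that parse as ISO dates get a date mapping.
        try:
            datetime.date.fromisoformat(value)
            return "date"
        except ValueError:
            return "string"
    return "object"

doc = {"user": "kimchy", "age": 38, "signed_up": "2014-02-12", "active": True}
mapping = {field: infer_field_type(v) for field, v in doc.items()}
print(mapping)
# {'user': 'string', 'age': 'long', 'signed_up': 'date', 'active': 'boolean'}
```

In the real system, users who need precise control can still declare an explicit mapping up front, exactly the schema-optional trade-off Kluge describes.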
Models also exist for using JSON-LD to represent RDF in a manner that can be indexed by Elasticsearch. The BBC World Service Archive prototype, in fact, uses an index based on Elasticsearch, constructed from the RDF data held in a central triple store, to ensure its search engine and aggregation pages are fast enough.
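The core of such a model is flattening the triples about one subject into a single JSON-LD document that a search engine can index. The sketch below shows the idea with illustrative, assumed vocabulary and identifiers; it is not the BBC’s actual data model.

```python
import json

# Hypothetical triples about one programme; prefixes and property
# names are illustrative assumptions, not the BBC's real schema.
triples = [
    ("prog:p01", "dc:title", "From Our Own Correspondent"),
    ("prog:p01", "po:genre", "genre:news"),
    ("prog:p01", "po:genre", "genre:documentary"),
]

def triples_to_jsonld(subject, triples):
    """Collect all triples about `subject` into one JSON-LD-style document."""
    doc = {
        "@context": {"dc": "http://purl.org/dc/terms/",
                     "po": "http://purl.org/ontology/po/"},
        "@id": subject,
    }
    for s, p, o in triples:
        if s != subject:
            continue
        # Repeated predicates become arrays, as JSON-LD allows.
        if p in doc:
            existing = doc[p]
            doc[p] = existing + [o] if isinstance(existing, list) else [existing, o]
        else:
            doc[p] = o
    return doc

print(json.dumps(triples_to_jsonld("prog:p01", triples), indent=2))
```

The resulting document keeps the RDF semantics (via `@context` and `@id`) while being ordinary JSON, which is exactly what a JSON-native engine like Elasticsearch expects.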
PHILADELPHIA, Feb. 4, 2014 /PRNewswire/ — The Intellectual Property & Science business of Thomson Reuters, the world’s leading provider of intelligent information for businesses and professionals, today announced the launch of Cortellis™ Data Fusion, an addition to the Thomson Reuters Cortellis suite, the industry’s most comprehensive information solution for drug discovery and development. Cortellis Data Fusion utilizes linked data technologies – frameworks that allow content to be shared across applications and enterprise or community boundaries – connecting users with data from internal proprietary systems as well as third-party resources to address Big Data challenges. Read more