Posts Tagged ‘Elasticsearch’

Prelert’s Elasticsearch Equipped with Anomaly Detection

Daniel Gutierrez reported, “Prelert, the anomaly detection company, today announced the release of an Elasticsearch Connector to help developers quickly and easily deploy its machine learning-based Anomaly Detective® engine on their Elasticsearch ELK (Elasticsearch, Logstash, Kibana) stack. Earlier this year, Prelert released its Engine API enabling developers and power users to leverage its advanced analytics algorithms in their operations monitoring and security architectures. By offering an Elasticsearch Connector, the company further strengthens its commitment to democratizing the use of machine learning technology, providing tools that make it even easier to identify threats and opportunities hidden within massive data sets. Written in Python, the Prelert Elasticsearch Connector source is available on GitHub. This enables developers to apply Prelert’s advanced, machine learning-based analytics to fit the big data needs within their unique environment.”

The article continues with, “Prelert’s Anomaly Detective processes huge volumes of streaming data, automatically learns normal behavior patterns represented by the data and identifies and cross-correlates any anomalies. It routinely processes millions of data points in real-time and identifies performance, security and operational anomalies so they can be acted on before they impact business. The Elasticsearch Connector is the first connector to be officially released by Prelert. Additional connectors to several of the most popular technologies used with big data will be released throughout the coming months.”
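To make the integration pattern concrete, here is a minimal, hypothetical sketch of what a connector of this kind does: page time-ordered documents out of an Elasticsearch index and forward them, in batches, to an anomaly-detection REST endpoint. The URLs, job ID, and field names below are illustrative placeholders, not Prelert’s actual API.

```python
# A minimal, hypothetical sketch of the connector pattern: page time-ordered
# documents out of an Elasticsearch index and forward them, in batches, to an
# anomaly-detection REST endpoint. The URLs, job ID, and field names are
# placeholders, not Prelert's actual API.
import json
import requests

ES_SEARCH_URL = "http://localhost:9200/logstash-2014.08.01/_search"
ENGINE_URL = "http://localhost:8080/engine/v1/data/example_job"  # hypothetical

def fetch_batch(offset, size=500):
    """Fetch one page of log events from Elasticsearch, oldest first."""
    query = {
        "from": offset,
        "size": size,
        "sort": [{"@timestamp": {"order": "asc"}}],
        "query": {"match_all": {}},
    }
    resp = requests.post(ES_SEARCH_URL, data=json.dumps(query))
    resp.raise_for_status()
    return [hit["_source"] for hit in resp.json()["hits"]["hits"]]

def stream_to_engine(records):
    """Forward a batch of records to the analytics engine as JSON lines."""
    payload = "\n".join(json.dumps(r) for r in records)
    resp = requests.post(ENGINE_URL, data=payload,
                         headers={"Content-Type": "application/json"})
    resp.raise_for_status()

offset = 0
while True:
    batch = fetch_batch(offset)
    if not batch:
        break
    stream_to_engine(batch)
    offset += len(batch)
```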

Read more here.

Image courtesy Prelert.

What’s The Word On Enterprise Search?

Photo Credit: Sean MacEntee/ Flickr


Context is king – at least when it comes to enterprise search. “Organizations are no longer satisfied with a list of search results — they want the single best result,” wrote Gartner in its latest Magic Quadrant for Enterprise Search report, released in mid-July. The report also estimates that the enterprise search market will reach $2.6 billion in 2017.

The leaders list this time around includes Google with its Search Appliance, which Google touts as benefitting from Google.com’s continually evolving technology, thanks to machine learning from billions of search queries. Also on that part of the quadrant are HP Autonomy, which Gartner says is “exceptionally good at handling searches driven by queries that include surmised or contextual information,” and Coveo and Perceptive Software, both of which are quoted as offering “considerable flexibility for the design of conversational search capabilities, to reduce the ambiguity of results.”

Read more

Daedalus Takes Meaning-As-A-Service To Excel, GATE And CMS Systems

Daedalus (which The Semantic Web Blog originally covered here) has just made its Textalytics meaning-as-a-service APIs available for Excel and GATE (General Architecture for Text Engineering), a Java suite of tools used for natural language processing tasks, including information extraction in many languages. Connecting its semantic analysis tools with these systems is one step in a larger plan to extend its integration capabilities with more API plug-ins.

“For us, integration options are a way to lower barriers to adoption and to foster the development of an ecosystem around Textalytics,” says Antonio Matarranz, who leads marketing and sales for Daedalus. The three main ecosystem scenarios, he says, are personal productivity tools, of which the Excel add-in is an example; NLP environments, of which GATE is an example (“but UIMA (Unstructured Information Management Applications) is also a target,” he says); and content management systems and search engines, among them open source systems like WordPress, Drupal, and Elasticsearch.
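As a rough illustration of how such an API is consumed (an Excel add-in or GATE plug-in essentially wraps a call like this for its host environment), here is a hedged Python sketch; the endpoint URL, parameter names, and response format are placeholders rather than Textalytics’ documented interface.

```python
# A rough sketch of calling a meaning-as-a-service style REST endpoint from
# Python; the URL, parameter names and response fields are placeholders and
# do not reflect Textalytics' documented API.
import requests

API_URL = "https://api.example.com/semantic-tagging"  # placeholder endpoint
API_KEY = "YOUR_API_KEY"                              # placeholder credential

def analyze(text, lang="en"):
    """Send raw text for semantic analysis and return the parsed JSON result."""
    resp = requests.post(API_URL, data={"key": API_KEY, "txt": text, "lang": lang})
    resp.raise_for_status()
    return resp.json()

print(analyze("Daedalus connects Textalytics to Excel and GATE."))
```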

Read more

SIREn Schemaless Structured Doc Search System Zips Through Complex Nested Document Search

Schemaless structured document search system SIREn (Semantic Information Retrieval ENgine) has posted some impressive benchmarks for a demonstration it did of its prowess in searching complex nested documents. A blog here discusses the test, which indexed a collection of about 44,000 U.S. patent grant documents, with an average of 1,822 nested objects per doc, comparing Lucene’s Blockjoin capability to SIREn.

The finding for the test dataset: “Blockjoin required 3,077MB to create facets over the three chosen fields and had a query time of 90.96ms. SIREn on the other hand required just 126 MB with a query time of 8.36ms. Blockjoin required 2442% more memory while being 10.88 times slower!”

SIREn, which was launched into its own website and community as part of SindiceTech’s relaunch (see our story here), attributes the results to its use of a fundamentally different conceptual model from the Blockjoin approach. In-depth tech details of the test are discussed here. There it is also explained that, while the focus of the document is Lucene/Solr, the results apply identically to Elasticsearch, which, under the hood, uses Lucene’s Blockjoin to support nested documents.
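For context on what is being benchmarked, the sketch below shows the standard Elasticsearch approach to nested documents: a field mapped as nested (which Lucene implements with Blockjoin under the hood) and a nested query that only matches when its conditions hold within the same child object. The index, type, and field names are illustrative, not the schema used in the test.

```python
# Illustrative sketch of the nested-document pattern: a "nested" field mapping
# (backed by Lucene Blockjoin) and a nested query against it. Index, type and
# field names are invented for illustration, not the benchmark's schema.
import json
import requests

ES = "http://localhost:9200/patents"

# Create the index, mapping "claims" as nested objects so each claim is
# matched as a unit rather than having its fields flattened together.
mapping = {
    "mappings": {
        "patent": {
            "properties": {
                "claims": {
                    "type": "nested",
                    "properties": {
                        "text": {"type": "string"},
                        "category": {"type": "string"},
                    },
                }
            }
        }
    }
}
requests.put(ES, data=json.dumps(mapping)).raise_for_status()

# A nested query only matches when both conditions hold inside the SAME claim.
query = {
    "query": {
        "nested": {
            "path": "claims",
            "query": {
                "bool": {
                    "must": [
                        {"match": {"claims.text": "semiconductor"}},
                        {"match": {"claims.category": "independent"}},
                    ]
                }
            },
        }
    }
}
print(requests.post(ES + "/patent/_search", data=json.dumps(query)).json())
```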

The Semantic Web Blog also checked in with SindiceTech CEO Giovanni Tummarello to get a further read on how SIREn has evolved since the relaunch to enable such results, and in other respects.

Read more

Additional Funding For Elasticsearch To Help Company Complement Its Real-Time Search And Analytics Stack

Elasticsearch – whose Elasticsearch, Logstash and Kibana products for discovering and extracting insights from structured and unstructured data were discussed earlier this year here – has raised $70 million in Series C financing from New Enterprise Associates (NEA). Benchmark Capital and Index Ventures also participated in the round. That brings the total to $104 million over the past 18 months.

“Nearly all companies, start-ups and Fortune 500 enterprises alike, need to be able to slice and dice rapidly expanding data volumes in real time,” says Steven Schuurman, co-founder and CEO. The funding, Schuurman says, will be applied to expanding sales, marketing and support personnel and efforts, as well as to investing in development to build more complementary products that work with the ELK stack.

“Ultimately, this round of funding will help us get to our goal, faster, of making the ELK stack the de facto platform for businesses to gain actionable insights from their data,” he says.

Read more

Declara Individualizes Large-Scale Learning

Learning at large scale. That’s the work Declara is undertaking with its CognitiveGraph platform, which leverages semantic search, social platforms and predictive analytics to build context-specific learning pathways for the individuals involved in mass learning efforts. Think, for example, of teachers in a country working to re-educate all its educators, or retail and manufacturing workers in parts of the world who need new skill sets because machines have taken on the work these people used to do.

Adults don’t have the luxury of just being focused on learning, so “we try to help them learn more effectively and quickly, using the CognitiveGraph as a way of knowing where to start from and how to get them to positive outcomes faster,” says co-founder and CEO Ramona Pierson. Its intelligent learning platform will determine what mentors and information exist within a closed private network or on the Web relative to supporting a user’s learning needs; what of all that will be the best fit for a particular user; and then match that learner to the best pathway to acquire the new skills. Among the technologies Declara is leveraging are the realtime search and analytics capabilities of Elasticsearch (which The Semantic Web Blog discussed most recently here) to turn data into insights.
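As a loose illustration only (this is not Declara’s implementation), matching a learner’s target skills against a pool of indexed learning resources can be expressed as an ordinary realtime Elasticsearch query; the index name and fields below are hypothetical.

```python
# Loose, hypothetical sketch: match a learner's target skills against indexed
# learning resources with an Elasticsearch query. Index and field names are
# invented and do not describe Declara's CognitiveGraph internals.
import json
import requests

ES_SEARCH = "http://localhost:9200/resources/_search"

def find_resources(skills, language="en", size=10):
    """Return the best-matching resources for the requested skills."""
    query = {
        "size": size,
        "query": {
            "filtered": {
                "query": {"match": {"skills": " ".join(skills)}},
                "filter": {"term": {"language": language}},
            }
        },
    }
    resp = requests.post(ES_SEARCH, data=json.dumps(query))
    resp.raise_for_status()
    return [hit["_source"] for hit in resp.json()["hits"]["hits"]]

for resource in find_resources(["data analysis", "spreadsheets"]):
    print(resource.get("title"))
```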

Read more

Elasticsearch 1.0 Takes Realtime Search To The Next Level

Elasticsearch 1.0 launches today, combining Elasticsearch realtime search and analytics, Logstash (which helps you take logs and other event data from your systems and store them in a central place), and Kibana (for graphing and analyzing logs) in an end-to-end stack designed to be a complete platform for data interaction. This first major update of the solution, which delivers actionable insights in real time from almost any type of structured and unstructured data source, follows on the heels of the release of the commercial monitoring solution Elasticsearch Marvel, which gives users insight into the health of Elasticsearch clusters.

Organizations from Wikimedia to Netflix to Facebook today take advantage of Elasticsearch, which VP of engineering Kevin Kluge says has been distinguished, since its open-source start four years ago, by its focus on realtime search in a distributed fashion. The native JSON and RESTful search tool “has intelligence where when it gets a new field that it hasn’t seen before, it discerns from the content of the field what type of data it is,” he explains. Users can optionally define schemas if they want, or be more freeform and very quickly add new styles of data and still profit from easier management and administration, he says.
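A small sketch of that schema-on-write behavior: index a JSON document into a fresh index with no predefined mapping, then ask Elasticsearch which field types it inferred (an explicit mapping could be supplied instead, as Kluge notes). The index, type, and field names here are arbitrary examples.

```python
# Sketch of dynamic mapping: index a document into an index with no predefined
# schema, then inspect the field types Elasticsearch inferred from the content.
# Index, type and field names are arbitrary examples.
import json
import requests

ES = "http://localhost:9200"

doc = {
    "user": "kimchy",
    "age": 37,
    "joined": "2014-02-12T10:30:00",
    "score": 4.5,
}

# No mapping has been defined for "demo"; Elasticsearch creates one on the fly.
requests.put(ES + "/demo/event/1", data=json.dumps(doc)).raise_for_status()

# The inferred mapping: "age" becomes a long, "joined" a date, "score" a
# double, and "user" a string.
print(json.dumps(requests.get(ES + "/demo/_mapping").json(), indent=2))
```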

Models also exist for using JSON-LD to represent RDF in a manner that can be indexed by Elasticsearch. The BBC World Service Archive prototype, in fact, uses an index based on Elasticsearch, constructed from the RDF data held in a central triple store, to make sure its search engine and aggregation pages are quick enough.
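Because JSON-LD is ordinary JSON, an RDF resource serialized that way can be indexed like any other document, with keys such as @id and name simply becoming searchable fields. The sketch below is illustrative and does not reproduce the BBC’s actual index or data.

```python
# Illustrative only: index an RDF resource serialized as JSON-LD and search it.
# The index, document and data are invented and do not reflect the BBC's setup.
import json
import requests

ES = "http://localhost:9200"

jsonld_doc = {
    "@context": "http://schema.org/",
    "@id": "http://example.org/programme/p0123456",
    "@type": "CreativeWork",
    "name": "World Service Archive sample programme",
    "keywords": ["radio", "archive"],
}

# refresh=true makes the document immediately visible to search for this demo.
requests.put(ES + "/archive/programme/p0123456?refresh=true",
             data=json.dumps(jsonld_doc)).raise_for_status()

# The JSON-LD keys are now ordinary fields; a match query finds the document.
search = {"query": {"match": {"name": "archive"}}}
hits = requests.post(ES + "/archive/_search", data=json.dumps(search)).json()
print(hits["hits"]["total"])
```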

Read more