Posts Tagged ‘triple store’

Semantic Web Developer Challenge: We Have A Winner And Runners-Up Too

The results of the Semantic Web Developer Challenge, co-sponsored by XSB and SemanticWeb.com and launched at this summer’s Semantic Technology and Business Conference, are in: The $5,000 prizewinner was a team of two, Greg Varga and Siraj Bawa, from Vanderbilt University. There were two runners-up: one was a team from Stony Brook University, composed of Mrinal Priyadarshi, Anurag Choudhary, and Paul Fodor; the other was Roman Sova from consulting firm Good Monster.

The aim of the Challenge was to build sourcing and product life cycle management applications leveraging XSB’s PartLink Data Model, which the company developed as a project for the Department of Defense Rapid Innovation Fund. The model uses semantic web technology to create a coherent Linked Data model for all part information in the Department of Defense’s supply chain – which includes about 40 million component parts, their manufacturers and suppliers, materials, technical characteristics and more.
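To make the Linked Data idea concrete, here is a minimal sketch in Python with rdflib of how a component part might be described as RDF triples. The namespace, identifiers, class, and property names below are hypothetical stand-ins for illustration, not the actual PartLink schema.

```python
# A minimal sketch (not the actual PartLink schema) of modeling a
# component part as Linked Data with rdflib. All names are hypothetical.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

PART = Namespace("http://example.org/partlink/")  # hypothetical namespace

g = Graph()
part = PART["part/5961-01-123-4567"]              # hypothetical part ID
g.add((part, RDF.type, PART.ComponentPart))
g.add((part, PART.partName, Literal("Silicon rectifier diode")))
g.add((part, PART.manufacturedBy, PART["org/acme-semiconductor"]))
g.add((part, PART.material, Literal("silicon")))

print(g.serialize(format="turtle"))
```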

The large collection of engineering product information has potential beyond DoD use alone. “The current size of the PartLink triple store is well over a billion triples — maybe 1.3 billion,” says Alberto Cassola, VP of sales and marketing at XSB. “For the industrial sector it may very well be one of the largest efforts of its kind.”

Read more

Tamr On Mission To Curate And Connect Data

Make it as easy to add and connect new data sources into the enterprise analytics infrastructure as it is to add a new web site to the modern web. That’s the mission of next-gen data curation company Tamr, a startup born from an MIT research project that brought together large numbers of tabular data sources in a scalable and repeatable way.

Just like Google does all the work to find and connect web sites hosting the information that users want, “we want to do the same with tabular data sources inside the enterprise,” says Tamr co-founder and CEO Andy Palmer. “Tamr provides systems of reference. If you are looking for attributes to add to an analysis or want data to support something, you have this reference place to go in the enterprise with a catalogue of all the data that exists across the company.”

So often businesses want to use analytics to address hard questions, but they can’t do so successfully unless they integrate many disparate data sources and create a referential catalog. With Tamr, Palmer says, they can ingest data sources very quickly into a semantic triple store, make them available in real time, and connect them using machine learning to map attributes and match records, providing a unified view of a given entity that can then be consumed by various business intelligence and analytics tools. To be usable, he points out, data has to be “very, very thoroughly connected into everything else for there to be context and reference for how it can be consumed and whether it is reliable.”
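As a toy illustration of the attribute-mapping step Palmer describes, the sketch below scores candidate column-name pairs by string similarity and proposes matches above a threshold. Tamr’s production system uses machine learning with expert feedback; this is only a stand-in for the general idea, with made-up attribute names and an arbitrary threshold.

```python
# Toy attribute mapping: propose matches between source and target
# column names by string similarity. Not Tamr's actual algorithm.
from difflib import SequenceMatcher

source_attrs = ["cust_name", "phone_no", "zip"]
target_attrs = ["customer_name", "telephone", "postal_code", "zip_code"]

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

for src in source_attrs:
    best = max(target_attrs, key=lambda tgt: similarity(src, tgt))
    score = similarity(src, best)
    if score > 0.5:  # arbitrary threshold for the sketch
        print(f"{src} -> {best} (score {score:.2f})")
```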

Read more

Stardog 2.1 Hits Scalability Breakthrough

Washington, DC – January 21, 2014 – The new release (2.1) of Stardog, a leading RDF database, hits new scalability heights with a 50-fold increase over previous versions. Using commodity server hardware at the $10,000 price point, Stardog can manage, query, search, and reason over datasets as large as 50B RDF triples.

The new scalability increases put Stardog into contention for the largest semantic technology, linked data, and other enterprise graph data projects. Stardog’s unique feature set – including reasoning and integrity constraint validation at large scale – means it will increasingly serve as the basis for complex software projects.
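As a quick illustration of working with such a store from client code, here is a minimal Python sketch using SPARQLWrapper. It assumes Stardog’s SPARQL HTTP endpoint follows a localhost:5820/{database}/query pattern; the database name and credentials are placeholders, not confirmed details of the 2.1 release.

```python
# A minimal sketch of querying a Stardog database over HTTP.
# The endpoint URL pattern, database name, and credentials are assumed.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://localhost:5820/mydb/query")  # assumed URL
endpoint.setCredentials("admin", "admin")  # placeholder credentials
endpoint.setQuery("""
    SELECT ?s ?p ?o
    WHERE { ?s ?p ?o }
    LIMIT 10
""")
endpoint.setReturnFormat(JSON)

results = endpoint.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["s"]["value"], binding["p"]["value"], binding["o"]["value"])
```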

“We’re really happy about the new scalability of Stardog,” says Mike Grove, Clark & Parsia’s Chief Software Architect, “which makes us competitive with a handful of top graph database systems. And our feature set is unmatched by any of them.”

The new scalability work also required engineering out garbage collection pauses during query evaluation, which the 2.1 release accomplishes. Along with a new hot backup capability, this makes Stardog more mature and production-capable than ever before.

Read more

MarkLogic 7 Vision: World-Class Triple Store and World-Beating Information Store

Last month at its MarkLogic World 2013 conference, the enterprise NoSQL database platform provider talked semantics as it relates to its MarkLogic Server technology, which ingests, manages and searches structured, semi-structured, and unstructured data (see our story here). Late last week the vendor was scheduled to provide an early access release of MarkLogic 7, formally due by year’s end, to a few dozen initial users.

“People see a convergence of search and semantics,” Stephen Buxton, Director of Product Management, recently told The Semantic Web Blog. To that end, many of the vendor’s customers have deployed MarkLogic technology alongside specialized triple stores, but what they really want, he says, is an integrated approach, “a single database that does both individually and both together. We see the future of search as semantics and the future of semantics as search, and they are very much converging.” At its recent conference, Buxton says, the company demonstrated a MarkLogic app it built to function like Google’s Knowledge Graph, to give an idea of the kinds of things the enterprise might do with search and semantics together.

Following up on comments made by MarkLogic CEO Gary Bloom in his keynote address at the conference, Buxton explained: “The function in MarkLogic we are working on in engineering is a way to store and manage triples in the MarkLogic database natively, right alongside structured and unstructured information – a specialized triples index so queries are very fast, and so you can do SPARQL queries in MarkLogic. So, with MarkLogic 7 we will have a world-class triple store and world-beating information store – no one else does documents, values and triples in combination the way MarkLogic 7 will.”
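For a sense of what SPARQL against MarkLogic could look like from client code, here is a minimal Python sketch against MarkLogic’s REST API. The /v1/graphs/sparql endpoint path, port, and credentials are assumptions for illustration, not confirmed details of the MarkLogic 7 release.

```python
# A minimal sketch of a SPARQL query over MarkLogic's REST API.
# Endpoint path, port, and credentials are assumptions.
import requests
from requests.auth import HTTPDigestAuth

response = requests.post(
    "http://localhost:8000/v1/graphs/sparql",     # assumed endpoint
    data={"query": "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"},
    headers={"Accept": "application/sparql-results+json"},
    auth=HTTPDigestAuth("user", "password"),      # placeholder credentials
)
response.raise_for_status()
for row in response.json()["results"]["bindings"]:
    print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```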

Read more

NoSQL Database Platform Vendor MarkLogic Gets $25 Million, Promises To Go Deep On Semantics

Enterprise NoSQL database platform provider MarkLogic has come into some cash: a $25 million round of growth capital from investors including Sequoia Capital, Tenaya Capital, Northgate Capital, CEO Gary Bloom and other corporate executives. Yesterday, at the company’s MarkLogic World 2013 conference, Bloom also prepared the audience to hear more today from company executives about MarkLogic’s next steps in semantics for its MarkLogic Server technology that ingests, manages and searches structured, semi-structured, and unstructured data.

“The way to think about this is that when we look at semantics, we didn’t … say we just want to check a box on semantics,” Bloom said, by working with partners on some low-hanging fruit – although the company will be collaborating with them on various semantic enrichment capabilities. “We think semantics is critical technology, and more interesting, I believe, is that it is a critical technology that is both a search technology as well as a database technology.” Others in the marketplace will focus on changing their search engines to do semantics, but optimum results won’t come if all that’s being done is layering in semantics at the search level, he said.

Read more

Graphs Make The World Of Data Go Round

“We want to help the world make sense of data and we think graphs are the best way of doing that.”

That’s the word from Emil Eifrem, CEO of Neo Technology, which makes the open-source Neo4j NoSQL graph database. He’s not talking in terms of RDF-centric solutions, even though he says he’s 100 percent in agreement with the vision of the semantic web and machine readability. “The world is a graph,” Eifrem says, “and RDF is a great way of connecting things. I’m all in agreement there.” The problem, in his opinion, is that execution on the software side has been lacking.

“This comes down to usability,” he says: the average developer, he believes, finds the semantic web-oriented tools largely incomprehensible. Eifrem says he’s speaking from real-world experience, having worked directly with RDF and taught classes on the semantic web layers. Where it took a week to get students up to speed on things like Jena and Sesame, they “get” the property graph and graph databases in half a day, he says. Neo4j stores data in nodes connected by directed, typed relationships, with properties on both – also known as a property graph.
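Here is a minimal sketch of that property graph model using the official neo4j Python driver: both the nodes and the typed KNOWS relationship carry properties. The connection details and example data are placeholders.

```python
# A minimal property graph sketch with the neo4j Python driver.
# Connection details and data are placeholders.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))  # placeholders

with driver.session() as session:
    # Nodes and the KNOWS relationship both carry properties,
    # which is what makes this a property graph.
    session.run(
        "CREATE (a:Person {name: $a})-[:KNOWS {since: $since}]->"
        "(b:Person {name: $b})",
        a="Alice", b="Bob", since=2012,
    )
    result = session.run(
        "MATCH (a:Person)-[k:KNOWS]->(b:Person) "
        "RETURN a.name, k.since, b.name"
    )
    for record in result:
        print(record["a.name"], "knows", record["b.name"],
              "since", record["k.since"])

driver.close()
```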

Read more

Pfizer Moves Semantic Tech Forward, Helping Business Respond To Cost Pressures And Realize Efficiency Gains

A couple of years back, The Semantic Web Blog visited with Vijay Bulusu to gain some insight into how pharma giant Pfizer Inc. was moving forward with semantic technology (see article here). At last week’s Semantic Technology and Business Conference in New York City, Bulusu, director, informatics and innovation at Pfizer, provided additional perspective on the issue – first, during the presentation on Using Linked Semantic Data in Biomedical Research and Pharmaceuticals (see coverage of that here), and then in a follow-up conversation.

A struggle for pharma companies, Bulusu notes, lies in driving standards for data that exists across system silos, so that it is broadly applicable across groups. A transaction like creating a batch of materials, doing analytical testing on it, and enabling clinical trial releases is the work of multiple groups of people in departments like R&D, entering data across different systems.

The foundational layer needed to support data aggregation in a persistent semantic graph database and visualization with collaborative, semantic knowledge maps “is all about data already in transactional, siloed systems,” Bulusu says. “We want to make sure that across those systems, key data is entered consistently for entities.” That means limiting users to selecting, via a drop-down list, from a vocabulary that is consistently managed and published from a single source to all these transaction systems, so the same entity is called by the same name as it traverses systems to support analytics and other requirements. That, he says, “is where we directly impact the day-to-day operational work of users.”
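As a toy sketch of that controlled-vocabulary pattern, the snippet below accepts an entity name only if it appears in a vocabulary published from a single source. The vocabulary contents and function names are purely illustrative, not Pfizer’s actual system.

```python
# Toy controlled-vocabulary validation: transactional systems accept an
# entity name only if it appears in a centrally published vocabulary.
# Vocabulary contents and names are illustrative only.
MANAGED_VOCABULARY = {
    "acetylsalicylic acid",   # one canonical name per entity,
    "atorvastatin calcium",   # published to every transaction system
}

def validate_entity_name(name: str) -> str:
    """Reject free-text entries that are not in the managed vocabulary."""
    if name.lower() not in MANAGED_VOCABULARY:
        raise ValueError(f"'{name}' is not in the managed vocabulary")
    return name.lower()

print(validate_entity_name("Atorvastatin Calcium"))  # accepted
```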

Read more

Get In On CrowdSourcing An Open Knowledge Graph API

Last week The Semantic Web Blog continued its coverage of Google’s Knowledge Graph with the news of its worldwide launch for English-language users. This week we’ve learned about a paper on work underway toward crowd-sourcing an Open Knowledge Graph API, submitted to the 1st International Workshop on Knowledge Extraction and Consolidation from Social Media (KECSM2012), which takes place in November in Boston.

The paper, authored by Thomas Steiner of the Universitat Politècnica de Catalunya in Barcelona and Stefan Mirea of Jacobs University, Bremen, Germany, and currently pending review, proposes that the crowd step in where Google has so far failed to tread when it comes to creating an interface to the Knowledge Graph of more than 500 million objects – landmarks, celebrities, cities, sports teams, buildings, movies, celestial objects, works of art, and more – and 3.5 billion facts about and relationships between them. There is no publicly available list of all those objects, and, say the authors, even if there were, “it would not be practicable (nor allowed by the terms and conditions of Google) to crawl it.” Hence, the crowd-sourcing approach.

Read more

DERI and Fujitsu Team On Research Program

The Digital Enterprise Research Institute (DERI) is kicking off a project with Fujitsu Laboratories Ltd. in Japan to build a large-scale RDF store in the cloud capable of processing hundreds of billions of triples. The idea, says DERI research fellow Dr. Michael Hausenblas, “is to build up a platform that allows you to process and convert any kind of data” — from relational databases to LDAP record-based, directory-like data, but also streaming sources of data, such as sensors and even the Twitter firehose.
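As a minimal sketch of the kind of conversion Hausenblas describes, the snippet below lifts rows from a relational table into RDF triples with Python’s rdflib. The table, column names, and namespace are illustrative only, not the project’s actual pipeline.

```python
# A minimal sketch of lifting relational rows into RDF triples.
# Table, columns, and namespace are illustrative only.
import sqlite3
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")  # hypothetical namespace

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensor (id INTEGER, reading REAL)")
conn.execute("INSERT INTO sensor VALUES (1, 21.5), (2, 19.8)")

g = Graph()
for sensor_id, reading in conn.execute("SELECT id, reading FROM sensor"):
    subject = EX[f"sensor/{sensor_id}"]
    g.add((subject, RDF.type, EX.Sensor))
    g.add((subject, EX.reading, Literal(reading)))

print(g.serialize(format="turtle"))
```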

The project has defined eight different potential enterprise use cases for such a platform, ranging from knowledge-sharing in health care and life science to dashboards in financial services informed by XBRL data. “Once the platform is there we will implement at least a couple of these use cases on business requirements, and essentially we are going to see which are the most promising for business units,” Hausenblas says.

Read more

Linked Open Data In Action In World War I Showcase Project

A fascinating project has been undertaken by the partners of the Pan-Canadian Documentary Heritage Network (PCDHN): a proof-of-concept showcase of Linked Open Data visualizations called “Out of the Trenches.” It is a look at the First World War from the Canadian perspective: war songs, postcards, newspapers, photos, and films, and these resources’ intersection with Canadian soldiers who fought in the war.

These digital resources from organizations such as McGill University, the Universities of Alberta, Calgary and Saskatchewan, and the Bibliothèque et Archives nationales du Québec have been linked through existing metadata provided in formats ranging from spreadsheets to MODS XML to RDF. Rather than reduce the metadata to a common subset, the approach was to maximize its use by moving to the “web of data” concept, so that the resources can be combined in different and unexpected ways, according to the proof-of-concept final report that was issued on the project.

The premise was to expose the metadata for these resources using RDF/XML and existing published ontologies – such as the Event Ontology, the Dublin Core Ontology, and the Biographical Ontology – along with element sets, vocabularies, and resources like the GeoNames geographical database, to maximize discovery by the user community and contribute to the Semantic Web.
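As a rough sketch of that approach, the snippet below describes a hypothetical digitized item with rdflib, using Dublin Core terms and a GeoNames place URI, then serializes it as RDF/XML. The item details, namespace, and URIs are illustrative, not taken from the project’s actual data.

```python
# A minimal sketch of exposing item metadata as RDF/XML with Dublin
# Core terms and a GeoNames URI. All details are illustrative.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS

EX = Namespace("http://example.org/pcdhn/")       # hypothetical namespace

g = Graph()
postcard = EX["item/postcard-001"]
g.add((postcard, DCTERMS.title, Literal("Postcard from the trenches")))
g.add((postcard, DCTERMS.date, Literal("1916")))
g.add((postcard, DCTERMS.spatial,
       URIRef("http://sws.geonames.org/6251999/")))  # illustrative GeoNames URI

print(g.serialize(format="xml"))  # RDF/XML, as the project used
```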

Read more
