Semantic Web Meets BI In New Project Whose Partners Include SAP, Sheffield Hallam University, Ontotext

SAP, which we’ve labeled one of the gorillas in the semantic web space, not surprisingly is involved in a lot of research work in Europe relating to this realm, including the Monnet (Multilingual Ontologies for Networked Knowledge) Project. One of the legs of that project has to do with cross-lingual business intelligence – using semantic technology to support search, query and information extraction of XBRL-based financial reports in a user’s native language, regardless of what language those reports are filed in.

News came this week about another effort that the software giant is coordinating in the semantic web-BI space. As part of a £4 million collaborative project for which SAP is the managing partner — dubbed Combining and Uniting Business Intelligence With Semantic Technologies (CUBIST) — the U.K.’s Sheffield Hallam University was awarded nearly £400,000 from the European Commission’s 7th Framework Programme to create new visual tools to help businesses make sense of tons of data.


In a nutshell, CUBIST is about developing an approach to semantic, easily understandable business intelligence by augmenting semantic technologies with BI capabilities and providing responsive, intuitive visual analytics, says Dr. Simon Andrews, one of the two academics leading the research at the university. “CUBIST aims to use a semantic technology called Formal Concept Analysis (FCA),” he says, an area in which Sheffield Hallam has expertise. The university will work with the consortium’s data warehousing/RDF triple-store experts to prepare data for FCA, and with its visualization experts to develop the FCA-based visual analytics, he says.

FCA, to explain it further, is a way of constructing a hierarchy of data, and is emerging as a data analysis technology for business intelligence, Andrews says. A key element of FCA is a visualization called the concept lattice, which portrays relational attribute/object data as a hierarchy of related groupings called Formal Concepts. The basis for FCA is a simple cross-table called a Formal Context that describes the relationships between objects and attributes. “Our aim is to allow the end business user to interact with the concept lattice and other elements of a GUI to perform semantic analyses of their data and to mine their data for hidden meaning,” Andrews says.

The table here shows a Formal Context representing the destinations of five airlines, where the elements on the left are formal objects and the elements at the top are formal attributes. As a document describing FCA explains, a cross in a cell indicates that the object has the corresponding attribute, and an empty cell indicates that it does not. For example, Air Canada flies to Latin America but not to Africa. Formal Concepts are maximal rectangles of crosses in the table. Asia Pacific, for instance, is flown to by all the airlines, and no other destination is flown to by all of them, so the column of crosses under Asia Pacific is a maximal rectangle. If USA is added, Ansett Australia is lost, but Europe is also flown to by the remaining four airlines, so adding Europe makes that rectangle of crosses maximal. A hierarchy of Formal Concepts thus becomes apparent and can be visualized as a lattice.
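To make the idea of maximal rectangles concrete, here is a minimal Python sketch that enumerates the Formal Concepts of a small Formal Context. It is an illustration only, not the CUBIST implementation. The cross-table below is an assumption: it reproduces just the facts stated above (Air Canada flies to Latin America but not Africa, every airline flies to Asia Pacific, and all airlines except Ansett Australia fly to the USA and Europe). The names "Airline C", "Airline D" and "Airline E" and all remaining cells are hypothetical placeholders, and the brute-force enumeration is a naive approach chosen for readability.

```python
from itertools import combinations

# Hypothetical Formal Context: each formal object (airline) maps to its
# formal attributes (destinations). Only the facts stated in the article are
# taken from it; "Airline C/D/E" and the other cells are invented placeholders.
context = {
    "Air Canada":       {"Latin America", "Europe", "Asia Pacific", "USA"},
    "Ansett Australia": {"Asia Pacific"},
    "Airline C":        {"Europe", "Asia Pacific", "USA", "Africa"},
    "Airline D":        {"Europe", "Asia Pacific", "USA"},
    "Airline E":        {"Latin America", "Europe", "Asia Pacific", "USA", "Africa"},
}


def common_attributes(objects, context, all_attributes):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return frozenset(shared)


def objects_having(attributes, context):
    """Objects that have every attribute in `attributes`."""
    return frozenset(obj for obj, attrs in context.items() if attributes <= attrs)


def formal_concepts(context):
    """Naively enumerate all (extent, intent) pairs, i.e. maximal rectangles.

    For every subset of objects, take the attributes they share (the intent),
    then collect every object that has all of those attributes (the extent).
    This is exponential in the number of objects, so it only suits toy tables.
    """
    all_attributes = set().union(*context.values())
    objects = list(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context, all_attributes)
            extent = objects_having(intent, context)
            concepts.add((extent, intent))
    return concepts


if __name__ == "__main__":
    # Print concepts from the most general (largest extent) to the most specific.
    for extent, intent in sorted(formal_concepts(context), key=lambda c: -len(c[0])):
        print(sorted(extent), "<->", sorted(intent))
```

Sorting the concepts by the size of their extent gives a rough top-to-bottom reading of the hierarchy that a concept lattice would draw: the concept containing all five airlines sits at the top, and each step down trades airlines for additional shared destinations.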

“Existing BI systems are poor at federating data from unstructured and structured sources and at extracting explicit meaning of data and explicit relations and links in data,” Andrews says. “CUBIST aims to address these problems by the use of semantic technologies which are better at this. End business users can find new and hidden meaning in their disparate data sources, and the concept lattice provides a new, conceptual view of their data.” Expected impacts are in three different areas of business intelligence – very large databases, bringing semantic enrichments to an industrial level, and visual analytics. “These are serious impacts at an international level,” Andrews says.

SAP provides the bulk of the project management and assumes an R&D role in this effort. Ontotext is tasked with providing the data warehousing/triple-store expertise and innovations in federating data from large-scale structured and unstructured sources, including the semantic web, wikis, and blogs. Centrale Recherche S.A. is providing visualization and visual analytics expertise. Andrews explains that CUBIST also has three use-case partners providing large-scale data and analysis scenarios: the U.K.’s Heriot-Watt University, with data from the Edinburgh Mouse Atlas Project; Belgium’s Space Applications Services, with data from its space-satellite services systems; and the U.K.’s Innovantage, with web-based market intelligence data.

The project milestones include creating requirements and mock-ups by month 6; completing architecture and implementation plans by month 12; deploying the first integrated prototype system and verifying core technologies by month 21; and evaluating and deploying the final integrated system at the three-year mark.

There was strong competition for the funding, Andrews says, with only the top eight evaluated bids being funded out of several hundred applications to the call for projects in Intelligent Information Management. The bid was a joint effort by all of the partners, but the successful elements that apply strongly to Sheffield Hallam, he says, included high-quality objectives and a very convincing concept that builds on well-defined, established technology in which the partners have very strong expertise.

“The proposal represents progress beyond the state-of-the-art in developing a semantic incorporated business intelligence platform dealing with large amount of data and offering interactive visualization,” he says. “The goal of the project is ambitious: it will develop the first framework for enriching Business Intelligence with Semantic Web technologies.”
