Hadoop is on almost every enterprise’s radar – even if they’re not yet actively engaged with the platform and its advantages for Big Data efforts. Analyst firm IDC earlier this year said the market for software related to the Hadoop and MapReduce programming frameworks for large-scale data analysis will have a compound annual growth rate of more than sixty percent between 2011 and 2016, rising from $77 million to more than $812 million.
Yet, challenges remain to leveraging all the possibilities of Hadoop, an Apache Software Foundation open source project, especially as it relates to empowering the data scientist. Hadoop is composed of two sub-projects: HDFS, a distributed file system built on a cluster of commodity hardware so that data stored in any node can be shared across all the servers, and the MapReduce framework for processing the data stored in those files.
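For readers new to the model, the split between storing data and processing it can be illustrated with a toy word count. This is only a minimal single-process sketch of the map, shuffle and reduce steps, not Hadoop's actual API; the function names and the sample `blocks` are hypothetical stand-ins for file blocks that HDFS would spread across nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group intermediate values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values seen for one key into a single total."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence over the input records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical records standing in for blocks stored on different nodes.
blocks = ["big data on hadoop", "hadoop stores big files"]
counts = word_count(blocks)
```

In a real cluster the map and reduce functions would run in parallel on the nodes that hold the data; here they run in one process purely to show the flow.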
Semantic technology can help solve many of the challenges, Michael A. Lang Jr., VP, Director of Ontology Engineering Services at Revelytix, Inc., told an audience gathered at the Semantic Technology & Business Conference in New York City yesterday.
Back in March, The Semantic Web Blog wrote an article about FIBO, the Financial Industry Business Ontology that’s on its way to being an Object Management Group series of standards. There, we explored its value as an open semantic standard that can be used by financial institutions and industry regulators, both to support conformance to federal regulatory reporting requirements and for internal business processes and risk analysis.
To continue the discussion about the operational value of FIBO, we recently spoke with key participants developing the standard: David Newman, Strategic Planning Manager, Vice President, Enterprise Architecture, Wells Fargo Bank, who leads the industry team collaborating on a semantic OTC (over-the-counter) derivatives proof of concept, and Mike Atkin, managing director at the Enterprise Data Management (EDM) Council, where FIBO was born and is included as content of EDM’s Semantics Repository.
Last week the New York City Council gave its nod of approval to legislation that would require city agencies to publish public data sets in a common format on an online portal for the public’s use. Mayor Bloomberg just signed off on it, with the Open Data Bill legislation to be phased in over six years.
“There are about 900 data sets in the New York City Open Data Catalogue,” says Ontodia co-founder Joel Natividad. Last year, while at TCG Software Services, he was part of a team that won the Large Organization Recognition Award at BigApps 2.0 – the city-sponsored contest for developers to use NYC Open Data – for participating in creating NYC Data Web, which integrates the NYC.gov data sets into a single web of data for developers. The team also included Revelytix and Spry. “Now that the Open Data Bill just passed, there will be a tsunami of data,” he says.
The 2011 Semantic Technology & Business Conference will land in Washington, DC in less than two weeks. The conference – which will take place at the Kellogg Conference Hotel from November 29-December 1, 2011 – is still making exciting additions to the already impressive roster of speakers and presentations. Recently, we have added top execs from such cutting-edge companies as Revelytix, Orbis, Veda, Phasic Systems, and Systap. They will be on hand to share their insights and reveal the latest developments in the world of Semantic Web Technologies.
Emergent analytics company Revelytix has released early access versions of Spinner and Rex for free evaluation. According to a statement by Revelytix, “Enterprise information management needs radical improvement: it hasn’t really changed much since the relational database was invented. Spinner enables a true, standards-based data federation capability throughout and beyond the enterprise. Spinner employs a new information modeling and description technology, OWL and RDF, from the W3C standards organization. It is now possible to combine enterprise knowledge with enterprise data to enable Emergent Analytics.”
A wide array of products, applications, and other advances in the world of semantic technology was announced at last week’s Semantic Technology Conference. Announcements came from such industry heavy-hitters as Franz, Revelytix, and Expert System. Below is a collection of announcements from event sponsors who used the forum of SemTech to reveal their latest updates.
Ask a group of Semantic Web professionals where the data should live when you’re doing data integration projects – which is just what Cambridge Semantics VP Lee Feigenbaum, acting in his capacity as co-chair of the W3C’s SPARQL Working Group, did at a panel at last week’s SemTech – and don’t expect to get a single, agreed-upon answer.
Among the choices:
“Federation will crush warehousing,” Eric Prud’hommeaux of the W3C and its Semantic Web Health Care and Life Sciences Interest Group said with an eye to provocation. “Leave data where the authorities have it and take advantage of individual domain contributions.” The basic idea of federation is that data stays in its source systems and you do integration dynamically, querying source systems on the fly.
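The federation idea can be sketched in a few lines. This is an illustrative toy under stated assumptions, not any panelist's or vendor's implementation: the two in-memory "source systems", their field names, and the helper `federated_makers_for` are all hypothetical. The point is only that nothing is copied into a warehouse ahead of time; each source is queried when the question is asked.

```python
def make_source(rows):
    """Return a query function over one source system's rows (hypothetical)."""
    def query(predicate):
        return [row for row in rows if predicate(row)]
    return query


# Two hypothetical source systems, each owned by a different authority.
clinical = make_source([
    {"patient": "p1", "drug": "aspirin"},
    {"patient": "p2", "drug": "statin"},
])
pharmacy = make_source([
    {"drug": "aspirin", "maker": "Acme"},
    {"drug": "statin", "maker": "Globex"},
])


def federated_makers_for(patient):
    """Answer one question by querying each source at request time and joining the results."""
    drugs = {r["drug"] for r in clinical(lambda r: r["patient"] == patient)}
    return sorted(r["maker"] for r in pharmacy(lambda r: r["drug"] in drugs))


makers = federated_makers_for("p1")
```

A warehousing approach would instead load both sets of rows into one store up front and run the join there, trading freshness and source ownership for a single query location.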
There are great conversations and education taking place at SemTech 2011 San Francisco this week. Tonight, we will feature the “Lightning Sessions,” five-minute talks. There are two concurrent tracks: a business-focused track and a technology-focused track. We will be covering both simultaneously. Whether you are here at SemTech or joining us remotely, we hope to see you here tonight at 4:45pm PST.
Maryland-based semantic technology company Revelytix will be presenting a number of their most promising products at the Semantic Technology Conference in a few weeks. One of their sessions will focus on Interactive Visualization Tools for Ontologies: “They say ‘a picture is worth a thousand words’ and it is Revelytix’s belief that the lack of meaningful visual representations of ontologies makes the whole idea of using them harder for the masses to adopt. In our efforts to create a robust interactive ontology visualization tool we have met many challenges… We will demonstrate our current visualization tool and how we have dealt with these issues.”
Back in November, we spoke with Michael Lang Sr. of Revelytix about the collaboration between business and Semantic Web tech-types that should be part of the development of OWL ontologies, if they’re to be of any real value to enterprises. (That story is here.) Over at its knoodl.com site, Revelytix has communities of ontology developers and subject matter experts joining together, matching what the latter set knows about the domain with what the former set knows about creating, managing and analyzing RDF/OWL descriptions.
Now Revelytix has introduced OntVis, a visualization tool for knoodl.com that shows the semantic contents of OWL DL ontologies and serves as another building block for seamless collaboration between ontologists and business subject matter experts. With OntVis, ontologists now “can communicate the precise semantics of any domain to a subject matter expert or business person,” according to a post on the company’s web site. It’s now possible to explore classes and properties in OWL ontologies and the relationships between resources without ever having to see a line of RDF, as the video below explains. Visualizations can be exported to PNG or PDF files, which can be given to SMEs and business people to review and offer comments.
“The problem we are solving for the enterprise (and the semantic web) is the ability to accomplish sophisticated analytics in a federated information environment,” Lang tells The Semantic Web Blog. He goes on to explain that the typical data warehouse or application database of today locates data in the same place as the query engine, and there is a fixed schema. “The relations between concepts are very precise and coded into the DBMS,” he says. “In a federated world, a schema is still required to produce the type of analytics required by the use case; this schema exists as an OWL model. If the OWL file does not represent a domain precisely, then the analytics will not be correct.”
While OWL modelers some day will become experts in the domains being modeled as more businesses take advantage of their expertise, today that is not generally the case. And if subject matter experts don’t participate in building the ontology, it won’t accomplish its mission. So, in order to get the OWL model of a domain correct, subject matter experts must validate the semantics that are incorporated into the model, Lang says. “These experts cannot read an OWL document directly because it is essentially a foreign language to them,” he says. “They are used to reviewing visualizations of models and commenting on the representation given by the visualization. OntVis is the first visualization capability that presents OWL DL semantics precisely and is readable by subject matter experts and business people.”