Barbara Starr of Search Engine Land recently wrote, “Although there has been some argument within the academic community that the Semantic Web ‘never happened,’ it is blatantly clear that Google has adopted its own version of it. Other search and social engines have as well — I wrote an article back in September 2012 discussing how search and social engines are adopting the Semantic Web and semantic search, and gave a timeline of the adoption of semantic search by both the search and social engines. It was very apparent, even then, that the search engines were moving in the direction of becoming answer engines, and that they were increasingly leveraging the Semantic Web and semantic search technology.”
Hadoop is on almost every enterprise’s radar – even if they’re not yet actively engaged with the platform and its advantages for Big Data efforts. Analyst firm IDC earlier this year said the market for software related to the Hadoop and MapReduce programming frameworks for large-scale data analysis will have a compound annual growth rate of more than sixty percent between 2011 and 2016, rising from $77 million to more than $812 million.
Yet challenges remain in taking full advantage of Hadoop, an Apache Software Foundation open source project, especially when it comes to empowering the data scientist. Hadoop comprises two sub-projects: HDFS, a distributed file system built on a cluster of commodity hardware so that data stored on any node can be shared across all the servers, and the MapReduce framework for processing the data stored in those files.
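The division of labor between the two phases can be sketched in a few lines of Python. This is only an illustration of the MapReduce programming model (the classic word count), not Hadoop's actual Java API; in a real cluster, each phase would run in parallel across many nodes against blocks stored in HDFS.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit a (word, 1) pair for every word in each input line."""
    for line in records:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Shuffle/reduce: group intermediate pairs by key and sum the values."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}

lines = ["Hadoop stores data in HDFS", "MapReduce processes data in HDFS"]
counts = reduce_phase(map_phase(lines))
print(counts["hdfs"])  # → 2
```

The key design point is that the mapper and reducer are independent, stateless functions, which is what lets the framework distribute them across a cluster.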
Semantic technology can help solve many of the challenges, Michael A. Lang Jr., VP, Director of Ontology Engineering Services at Revelytix, Inc., told an audience gathered at the Semantic Technology & Business Conference in New York City yesterday.
Alistair Croll recently argued that Big Data is this generation’s civil rights issue. He explains, “In the old, data-is-scarce model, companies had to decide what to collect first, and then collect it. A traditional enterprise data warehouse might have tracked sales of widgets by color, region, and size. This act of deciding what to store and how to store it is called designing the schema, and in many ways, it’s the moment where someone decides what the data is about. It’s the instant of context. That needs repeating: You decide what data is about the moment you define its schema.”
A schema.org blog announcement was posted today by Aaron Brown (Google) and C. Michael Gibson, MD (Wikidoc), announcing “a major set of additions to schema.org that improve our coverage of health and medical topics.” Dan Brickley of schema.org said, “Schema.org gains about 100 classes and 200 properties here, so this is a major addition.”
The announcement points out how this effort is different from existing HCLS vocabulary work as well as previous schema.org work. “Although there are many existing efforts around structured data for health and medicine, such structure is today typically available only ‘behind the scenes’ rather than shared in the Web using standard markup. Our design goals therefore differed from many previous initiatives, in that we focused on markup for use by Webmasters and publishers. Our main goal was to create markup that will help patients, physicians, and generally health-interested consumers find relevant health information via search.”
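Since the stated goal is markup that webmasters embed directly in pages, a minimal sketch helps show what that looks like in practice. The following Python snippet assembles a microdata fragment for a schema.org MedicalCondition; the condition values are invented, and this is an illustration rather than markup taken from the announcement.

```python
# Illustrative values for a schema.org MedicalCondition page; "name" and
# "alternateName" are standard schema.org property names.
condition = {"name": "Influenza", "alternateName": "Flu"}

markup = (
    '<div itemscope itemtype="http://schema.org/MedicalCondition">\n'
    + "\n".join(
        f'  <span itemprop="{prop}">{value}</span>'
        for prop, value in condition.items()
    )
    + "\n</div>"
)
print(markup)
```

Because the structure lives in the page itself rather than "behind the scenes," a search crawler can extract it without any special API access.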
Barbara Starr has written an article about how retailers can benefit from Google’s Knowledge Graph. She writes, “After Google’s Metaweb acquisition, the search engines were all becoming, in baby steps, what I would call ‘answer engines.’ Typing in a query such as ‘Barack Obama birthday’ would yield an answer. I tried it again recently and the result was amazing! Placing semantic markup on your webpages makes them more findable. For shopping sites, the markup information can be leveraged so users quickly identify sites that have only the relevant products they seek. Some examples of rich snippets are shown below.”
She continues, “Google is now leveraging linked data within the enterprise, which is very clear within the ‘Knowledge Graph,’ and in some fashion, they are doing so with the consumed information from rich snippets, namely retail, reviews, etc. In this case we specifically refer to the retail aspect and its associated domains/schemas.”
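For a retail page, the structured data behind such a rich snippet can be sketched as a schema.org Product description. The example below uses JSON-LD, a serialization schema.org also accepts alongside microdata; the product and price are invented values for illustration.

```python
import json

# A minimal schema.org Product in JSON-LD; all values are made up.
product = {
    "@context": "http://schema.org",
    "@type": "Product",
    "name": "Trail Running Shoe",
    "offers": {"@type": "Offer", "price": "79.99", "priceCurrency": "USD"},
}
jsonld = json.dumps(product, indent=2)
print(jsonld)
```

Embedded in a page (inside a `script type="application/ld+json"` element), this is the kind of structure a search engine can consume to show price and availability directly in results.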
The schema.org official blog has announced support for enumerated lists. This addition lets developers reference selected externally maintained vocabularies in their schema.org markup. According to the W3C-hosted schema.org WebSchemas wiki, “This is in addition to the existing extension mechanisms we support, and the general ability to include whatever markup you like in your pages. The focus here is on external vocabularies which can be thought of as ‘supported’ (or anticipated) in some sense by schema.org.”
In other words, “Schema.org markup uses links into well-known authority lists to clarify which particular instance of a schema.org type (e.g. Country) is being mentioned.”
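As a sketch of that mechanism, the helper below builds microdata for a schema.org Country and links the instance to an entry in an external authority list (here a DBpedia URI). The linking property name `sameAs` and the URI are illustrative choices, not prescribed by the announcement.

```python
def country_markup(name, authority_uri):
    """Build microdata for a schema.org Country, pointing the instance at an
    external authority-list entry so consumers know exactly which country
    is meant (the 'sameAs' property name here is illustrative)."""
    return (
        '<div itemscope itemtype="http://schema.org/Country">\n'
        f'  <span itemprop="name">{name}</span>\n'
        f'  <link itemprop="sameAs" href="{authority_uri}">\n'
        "</div>"
    )

print(country_markup("Norway", "http://dbpedia.org/resource/Norway"))
```

The link removes ambiguity: "Norway" as plain text could be parsed many ways, but the authority URI identifies one specific instance.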
On Tuesday the E&P Information Management Association (EPIM) launched EPIM ReportingHub (ERH), an interesting semantic technology project in the field of oil and gas. According to the project website, ERH is “a very flexible knowledgebase for receiving, validating (using NPD’s Fact Pages and PCA RDL), storing, analysing, and transmitting reports. The operators shall send XML schemas for DDR, DPR and MPR to ERH and ERH sends DDR and MPR as XML schemas to the NPD/PSA and all three reports as PDF to EPIM’s License2Share (L2S). The partners may download all three reports and/or any data from one or more reports through flexible queries. Some parts of ERH will be in operation already in November 2011 and the rest as soon as the authorities and the industry are ready for it. ERH is owned and operated by EPIM.”
George Thomas recently wrote about the exciting advances in adding clinical quality linked data to Health.Data.gov. Thomas also presented on this topic at last week’s Semantic Technology Conference. Thomas writes, “In addition to making flatfiles available to download on the Web, and providing applications that enable programmatic access to backend databases through the Web, imagine using the Web itself as a database: a massively distributed, decentralized database. This is what Linked Data is about – putting data in the Web. As part of our ongoing collaboration to democratize open government data with Data.gov, the Centers for Medicare and Medicaid Services are now publishing Clinical Quality Linked Data on Health.data.gov, beginning with Hospital Compare.”
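The "Web as database" idea can be sketched as follows: every resource has a URI, dereferencing the URI yields statements about it, and some of those statements link to further URIs. The resource names and properties below are invented for illustration, and the in-memory dictionary stands in for documents actually published on the Web.

```python
# A toy model of Linked Data: each "document" holds property/value pairs
# about one subject and links out to other subjects by URI.
# All URIs and property names here are invented.
web = {
    "http://health.data.gov/hospital/42": [
        ("name", "Example General Hospital"),
        ("measuredBy", "http://health.data.gov/measure/AMI-1"),
    ],
    "http://health.data.gov/measure/AMI-1": [
        ("label", "Aspirin at arrival"),
    ],
}

def describe(uri):
    """Dereference a URI: return the statements published about it, as a
    Linked Data client would by fetching the document at that address."""
    return dict(web.get(uri, []))

hospital = describe("http://health.data.gov/hospital/42")
measure = describe(hospital["measuredBy"])  # follow the link to a new resource
print(measure["label"])  # → Aspirin at arrival
```

Because each hop is just another URI lookup, a client can traverse data published by different agencies without any single central database.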
August Jackson recently wrote about the difficulties of explaining the concept of ontology: “As I’ve tested elevator pitches it’s become clear to me that the word “ontology” is the kiss of death with a few audiences. Friends and family without a background in IT immediately assume it has something to do with being a cancer doctor. Often those who are familiar with ontology have been burned by academic exercises and are very sceptical that ontology has anything practical to offer.”
Jackson continues with a simple definition, “Ontology, at its most fundamental level, enables a computer to ‘understand’ complex concepts in a way similar to how humans understand those concepts. This makes it possible for computers to process complex information such as that contained in text much the way we are familiar with computers processing spreadsheets and databases full of numerical data.”
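A tiny sketch makes the "understanding" concrete: an ontology declares relationships between concepts, and a program can then infer facts that were never stated directly. The class names below are invented, and this single subclass-walking rule stands in for the much richer reasoning a real ontology language such as OWL supports.

```python
# A minimal concept hierarchy; all class names are invented for illustration.
subclass_of = {
    "Cardiologist": "Physician",
    "Physician": "HealthProfessional",
}

def is_a(cls, ancestor):
    """Infer class membership by walking the subclass hierarchy."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = subclass_of.get(cls)
    return False

# Never stated directly, but inferred through the hierarchy:
print(is_a("Cardiologist", "HealthProfessional"))  # → True
```

That inference step is the practical payoff: a query for "health professionals" can match a document that only ever says "cardiologist."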