Posts Tagged ‘RDF’
Jennifer Zaino recently wrote an article for our sister website DATAVERSITY on the evolving field of NoSQL databases. Zaino wrote, “Hadoop HBase. MongoDB. Cassandra. Couchbase. Neo4j. Riak. Those are just a few of the sprawling community of NoSQL databases, a category that originally sprang up in response to the internal needs of companies such as Google, Amazon, Facebook, LinkedIn, Yahoo and more – needs for better scalability, lower latency, greater flexibility, and a better price/performance ratio in an age of Big Data and Cloud computing. They come in many forms, from key-value stores to wide-column stores to data grids and document, graph, and object databases. And as a group – however informally defined – NoSQL (considered by most to mean ‘not only SQL’) is growing fast. The worldwide NoSQL market is expected to reach $3.4 billion by 2018, growing at a CAGR of 21 percent between last year and 2018, according to Market Research Media. Read more
Last week news came from SindiceTech about the availability of its SindiceTech Freebase Distribution for the cloud (see our story here). SindiceTech has finalized its separation from the university setting in which it incubated, the former DERI institute, now part of the Insight Centre for Data Analytics, and is relaunching its activities, with more new solutions and capabilities on the way.
“The first thing was to launch the Knowledge Graph distribution in the cloud,” says CEO Giovanni Tummarello. “The Freebase distribution showcases how it is possible to quickly have a really large Knowledge Graph in one’s own private cloud space.” The distribution comes instrumented with some of the tools SindiceTech has developed to help users both understand and make use of the data, he says, noting that “the idea of the Knowledge Graph is to have a data integration space that makes it very simple to add new information, but all that power is at risk of being lost without the tools to understand what is in the Knowledge Graph.”
The first round of the distribution’s tools for composing queries and understanding the data as a whole includes the Data Types Explorer (in both tabular and graph versions) and the Assisted SPARQL Query Editor. The next releases will add more tools and provide updated data. “Among the tools expected is an advanced Knowledge Graph entity search system based on our newly released SIREn search system,” he says.
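To give a flavor of the kind of query such tooling helps compose, here is a minimal SPARQL sketch against a Freebase-style triplestore. The `fb:` prefix follows the public Freebase RDF dump; the exact property names and the English-label filter are assumptions and may differ in the distribution's loaded schema:

```sparql
PREFIX fb: <http://rdf.freebase.com/ns/>

# List ten entities typed as people, with their English names.
# Property names follow the public Freebase RDF dump conventions;
# adjust them to the schema actually loaded in the distribution.
SELECT ?person ?name
WHERE {
  ?person fb:type.object.type fb:people.person ;
          fb:type.object.name ?name .
  FILTER (lang(?name) = "en")
}
LIMIT 10
```

The Data Types Explorer is aimed at exactly the first step of writing such a query: discovering which types and properties actually occur in the graph.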
With the support of Google Developers, SindiceTech has announced the availability of its Freebase Distribution for the cloud. According to SindiceTech, “Freebase is an amazing data resource at the core of Google’s ‘Knowledge Graph’. Freebase data is available for full download but today, using it ‘as a whole’ is anything but simple. The SindiceTech Freebase distribution solves that by providing all the Freebase knowledge preloaded in an RDF-specific database (also called a triplestore) and equipped with a set of tools that make it much easier to compose queries and understand the data as a whole.”
Your Own Private Freebase
Tagged is looking for a Big Data Engineer. According to the post, “Technology is at the point where ubiquitous devices mitigate the problem of physical distance between people. In other words, the internet is part of our physical world and our physical world embraces the internet. The round-trip of reality to digital bits and back means that all life is now data – it is all about the capture, extraction, augmentation, interpretation, transformation, composition, propagation and last but not least, the volume of data. The Data Engineer will be the most important software engineering position for decades to come.” Read more
Charles Silver of Algebraix recently shared his opinions on artificial intelligence‘s recently revamped popularity and growing plausibility. Silver writes, “Just a few months ago, the phrase ‘artificial intelligence’ suddenly started being tossed around presentations, blogs, headlines, seminars — even a Facebook earnings meeting — as if it were the most benign concept in the world. AI could actually win an Oscar, thanks to Scarlett Johansson’s riveting voice-only performance as Samantha, the AI-enabled OS in the new movie ‘Her’. One reason for AI’s new respectability: Big steps have been made in solving the problems of artificial intelligence, especially in speech recognition and concept communication. Just think about how casually we now accept machines that can understand and talk, from Apple’s Siri to IBM’s ‘Jeopardy’-winning Watson.” Read more
Washington, DC – January 21, 2014 – The new release (2.1) of Stardog, a leading RDF database, hits new scalability heights with a 50-fold increase over previous versions. Using commodity server hardware at the $10,000 price point, Stardog can manage, query, search, and reason over datasets as large as 50B RDF triples.
The new scalability increases put Stardog into contention for the largest semantic technology, linked data, and other graph data enterprise projects. Stardog’s unique feature set at large scale, including reasoning and integrity constraint validation, means it will increasingly serve as the basis for complex software projects.
“We’re really happy about the new scalability of Stardog,” says Mike Grove, Clark & Parsia’s Chief Software Architect, “which makes us competitive with a handful of top graph database systems. And our feature set is unmatched by any of them.”
The scalability work also required software engineering to remove garbage-collection pauses during query evaluation, which the 2.1 release accomplishes as well. Along with a new hot backup capability, Stardog is more mature and production-capable than ever before.
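To illustrate what reasoning at query time means in practice, consider a minimal sketch (the `ex:` names here are invented for the example, not taken from Stardog's documentation). Suppose the data asserts only that `ex:acme` is an `ex:Corporation`, and the schema asserts `ex:Corporation rdfs:subClassOf ex:Organization`:

```sparql
PREFIX ex: <http://example.org/>

# With reasoning enabled, this query also returns ex:acme,
# because the rdfs:subClassOf axiom entails that every
# ex:Corporation is an ex:Organization -- even though the triple
# "ex:acme a ex:Organization" is never asserted in the data.
SELECT ?org
WHERE { ?org a ex:Organization }
```

Doing this kind of inference over tens of billions of triples, rather than materializing every entailed triple up front, is what makes reasoning at scale a distinguishing feature.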
We reported yesterday on the news that JSON-LD has reached Recommendation status at W3C. Three formal vocabularies also reached that important milestone yesterday:
The W3C documentation for the Data Catalog Vocabulary (DCAT) says that DCAT “is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web….By using DCAT to describe datasets in data catalogs, publishers increase discoverability and enable applications easily to consume metadata from multiple catalogs. It further enables decentralized publishing of catalogs and facilitates federated dataset search across sites. Aggregated DCAT metadata can serve as a manifest file to facilitate digital preservation.”
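As a minimal illustration of how a publisher might use DCAT (the `ex:` resources and titles are invented for the example), a catalog describing one dataset with a downloadable CSV distribution could look like this in Turtle:

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/> .

ex:catalog a dcat:Catalog ;
    dct:title "Example Open Data Catalog" ;
    dcat:dataset ex:budget-2014 .

ex:budget-2014 a dcat:Dataset ;
    dct:title "City Budget 2014" ;
    dcat:keyword "budget", "finance" ;
    dcat:distribution [
        a dcat:Distribution ;
        dcat:downloadURL <http://example.org/budget-2014.csv> ;
        dcat:mediaType "text/csv"
    ] .
```

An aggregator that harvests records like this from many sites can then offer federated dataset search without each catalog agreeing on anything beyond the vocabulary itself.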
Meanwhile, the RDF Data Cube Vocabulary addresses the following issue: “There are many situations where it would be useful to be able to publish multi-dimensional data, such as statistics, on the web in such a way that it can be linked to related data sets and concepts. The Data Cube vocabulary provides a means to do this using the W3C RDF (Resource Description Framework) standard. The model underpinning the Data Cube vocabulary is compatible with the cube model that underlies SDMX (Statistical Data and Metadata eXchange), an ISO standard for exchanging and sharing statistical data and metadata among organizations. The Data Cube vocabulary is a core foundation which supports extension vocabularies to enable publication of other aspects of statistical data flows or other multidimensional data sets.”
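As a sketch of the Data Cube model (the `ex:` dataset, dimensions, and measure are invented for the example), a single statistical observation — an unemployment rate for one area and one year — might be published like this:

```turtle
@prefix qb:  <http://purl.org/linked-data/cube#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .

ex:unemployment a qb:DataSet ;
    qb:structure ex:unemploymentDSD .   # the DataStructureDefinition
                                        # declaring the dimensions/measure

ex:obs1 a qb:Observation ;
    qb:dataSet  ex:unemployment ;
    ex:refArea   ex:Ireland ;             # dimension
    ex:refPeriod "2013"^^xsd:gYear ;      # dimension
    ex:rate      "13.5"^^xsd:decimal .    # measure
```

Because each observation is addressable RDF, individual data points can be linked to related datasets and concepts, which is the interoperability the vocabulary is designed for.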
Lastly, the W3C now recommends use of the Organization Ontology, “a core ontology for organizational structures, aimed at supporting linked data publishing of organizational information across a number of domains. It is designed to allow domain-specific extensions to add classification of organizations and roles, as well as extensions to support neighbouring information such as organizational activities.”
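A minimal sketch of the Organization Ontology in use (the `ex:` organization, unit, person, and role are invented for the example) shows the reified-membership pattern that lets a role be attached to a person's membership in a unit:

```turtle
@prefix org:  <http://www.w3.org/ns/org#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/> .

ex:acme a org:Organization .

ex:engineering a org:OrganizationalUnit ;
    org:subOrganizationOf ex:acme .

ex:leadEngineer a org:Role .
ex:alice a foaf:Person .

# Membership is modeled as its own resource, so the role (and,
# in extensions, dates or other attributes) can hang off it.
[] a org:Membership ;
    org:member       ex:alice ;
    org:organization ex:engineering ;
    org:role         ex:leadEngineer .
```

Domain-specific extensions then subclass `org:Organization` or `org:Role` to add the classifications a particular sector needs.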
Ivan Herman Discusses Lead Role At W3C Digital Publishing Activity — And Where The Semantic Web Can Fit In Its Work
There’s a (fairly) new World Wide Web Consortium (W3C) activity, the Digital Publishing Activity, and it’s headed up by Ivan Herman, formerly the Semantic Web Activity Lead there. The Semantic Web Activity was subsumed in December by the W3C Data Activity, with Phil Archer taking the role of Lead (see our story here).
Begun last summer, the Digital Publishing Activity has, as Herman describes it, “millions of aspects, some that have nothing to do with the semantic web.” But some, happily, that do – and that are extremely important to the publishing community, as well.