Benjamin Young of Cloudant reports, “Data is often stored and distributed in esoteric formats… Even when the data is available in a parse-able format (CSV, XML, JSON, etc), there is often little provided with the data to explain what’s inside. If there is descriptive meta data provided, it’s often only meant for the next developer to read when implementing yet-another-parser for said data. Really, it’s all quite abysmal… Enter, JSON-LD! JSON-LD (JSON Linked Data) is a simple way of providing semantic meaning for the terms and values in a JSON document. Providing that meaning with the JSON means that the next developer’s application can parse and understand the JSON you gave them.” Read more
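To make Young's point concrete, here is a minimal, hypothetical sketch (the document, context, and vocabulary choices are ours, not Cloudant's) of how a JSON-LD `@context` attaches meaning to plain JSON keys by mapping each term to a shared vocabulary IRI:

```python
import json

# A plain JSON document plus a JSON-LD "@context" that maps each term
# to an IRI in a shared vocabulary (schema.org here) -- hypothetical example.
doc = {
    "@context": {
        "name": "http://schema.org/name",
        "homepage": {"@id": "http://schema.org/url", "@type": "@id"},
    },
    "name": "Cloudant",
    "homepage": "https://cloudant.com/",
}

def expand(document):
    """Naive illustration of JSON-LD term expansion: replace each key
    with the IRI its @context entry points to. (Real JSON-LD processors
    handle much more -- nesting, @type coercion, default vocabularies.)"""
    ctx = document.get("@context", {})
    expanded = {}
    for key, value in document.items():
        if key == "@context":
            continue
        mapping = ctx.get(key, key)
        iri = mapping["@id"] if isinstance(mapping, dict) else mapping
        expanded[iri] = value
    return expanded

print(json.dumps(expand(doc), indent=2))
```

Because the expanded keys are globally unique IRIs rather than bare strings, the "next developer" Young mentions can interpret the data without reading a bespoke parser's source.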
Posts Tagged ‘database’
Tom Simonite of the MIT Technology Review recently wrote, “For all its success, Google’s famous PageRank algorithm has never understood a word of the billions of Web pages it has directed people to over the years. That’s why in 2010 Google acquired Metaweb, a company building a database intended to give computers the ability to understand the world. Two years later the company’s technology resurfaced as the Knowledge Graph. John Giannandrea, vice president of engineering at Google and a Metaweb cofounder, says that will lead to Google’s future products being able to truly understand the people who use them and the things they care about. He told MIT Technology Review’s Tom Simonite how a data store designed to link together all the knowledge on Earth might do that.” Read more
Max Smolaks of Tech Week Europe reports, “Andrew Fogg, co-founder of the UK start-up Import.io, thinks every web resource should have an Application Programming Interface (API). In order to make online data more accessible, his company turns any website into a spreadsheet or an API, for free. Fogg claims that in the past few months, the users of this service have created more Web APIs than the rest of the Internet combined. Jerome Bouteiller interviewed the entrepreneur at the LeWeb 2013 conference in Paris, where the two discussed the future of the company and the idea of the Semantic Web, proposed by Sir Tim Berners-Lee, inventor of the World Wide Web.” Read more
Our discussion of Big Data at SemTechBiz, begun here, continues:
The Enterprise Linked Data Cloud Needs Semantics, And More
Another exploration of Big Data’s intersection with semantic technology will take place at this session, where Dr. Giovanni Tummarello, senior research fellow at DERI and CTO of SindiceTech, will talk about the former becoming an enabler for the latter to be really useful in enterprises. “A lot of people say it’s via Big Data that semantic technologies like RDF will see a coming of age and clear applications in certain industries,” he says. There’s value to adding data first and understanding it later, and to that end, “semantic technologies give you the most agile tool to deal with data you don’t know, where there’s a lot of diversity, and you don’t know what of it particularly will be useful.”
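Tummarello's "add data first, understand it later" argument can be illustrated with a toy RDF-style triple store (our sketch, not SindiceTech code): heterogeneous facts go in with no upfront schema, and the structure is discovered afterwards by querying patterns.

```python
# Illustrative sketch: a schema-less triple store in the RDF style.
# Facts of entirely different shapes coexist; structure is queried, not declared.
triples = set()

def add(s, p, o):
    """Assert a (subject, predicate, object) fact."""
    triples.add((s, p, o))

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [(ts, tp, to) for (ts, tp, to) in triples
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

# Ingest diverse data first...
add("emp:42", "foaf:name", "Ada")
add("emp:42", "ex:department", "dept:eng")
add("sensor:7", "ex:reading", "21.5")   # different shape -- no schema change needed

# ...then discover what properties the data actually contains.
predicates = {p for (_, p, _) in triples}
```

The agility Tummarello describes comes from this last step: with diverse data of unknown usefulness, you never had to decide a table layout before loading.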
Last month at its MarkLogic World 2013 conference, the enterprise NoSQL database platform provider talked semantics as it related to its MarkLogic Server technology, which ingests, manages and searches structured, semi-structured, and unstructured data (see our story here). Late last week, the vendor was scheduled to provide an early-access release of MarkLogic 7, formally due by year’s end, to several dozen initial users.
“People see a convergence of search and semantics,” Stephen Buxton, Director of Product Management, recently told The Semantic Web Blog. To that end, many of the vendor’s customers have deployed MarkLogic technology alongside specialized triple stores, but what they really want, he says, is an integrated approach: “a single database that does both individually and both together. We see the future of search as semantics and the future of semantics as search, and they are very much converging.” At its recent conference, Buxton says, the company demonstrated a MarkLogic app it built to function like Google’s Knowledge Graph, to give an idea of the kinds of things the enterprise might do with search and semantics together.
Following up on the comments made by MarkLogic CEO Gary Bloom in his keynote address at the conference, Buxton explained that “the function in MarkLogic we are working on in engineering is a way to store and manage triples in the MarkLogic database natively, right alongside structured and unstructured information – a specialized triples index so queries are very fast, and so you can do SPARQL queries in MarkLogic. So, with MarkLogic 7 we will have a world-class triple store and world-beating information store – no one else does documents, values and triples in combination the way MarkLogic 7 will.”
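The source does not show MarkLogic's actual API, so the following pure-Python sketch (names and data entirely hypothetical) only illustrates the idea Buxton describes: answering one question by combining a semantic lookup over triples with a query over documents stored in the same database.

```python
# Hypothetical sketch of "documents and triples in one database";
# this is NOT MarkLogic's API, just an illustration of the concept.
documents = {
    "doc1": {"text": "Quarterly report for ACME Corp.", "author": "jsmith"},
    "doc2": {"text": "Meeting notes on the merger.", "author": "adoe"},
}
triples = {
    ("jsmith", "worksFor", "ACME"),
    ("adoe", "worksFor", "Globex"),
}

def docs_by_employer(employer):
    """Combine a semantic lookup (who works for `employer`?) with a
    document query (which docs did those people author?)."""
    people = {s for (s, p, o) in triples
              if p == "worksFor" and o == employer}
    return [doc_id for doc_id, doc in documents.items()
            if doc["author"] in people]
```

In a system with a native triples index, the first half of this function would be a fast SPARQL query and the second a document search, executed together rather than across two separate stores.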
Philipp Grätzel von Grätz of Medical Xpress writes, “Imagine a hospital where patient data from numerous sources is made accessible to ward physicians with the help of hyperlinks and intelligent indexing. Imagine a healthcare system that hands its patients – not an envelope or a CD-ROM – but an integrated dataset that allows them to truly understand their illness, and even use the Internet to obtain additional information. Imagine a radiologist who uses semantic technologies to navigate smoothly through the myriad imaging data. Welcome to the future of semantic technologies in health information retrieval.” Read more
Enterprise NoSQL database platform provider MarkLogic has come into some cash: a $25 million round of growth capital from investors including Sequoia Capital, Tenaya Capital, Northgate Capital, CEO Gary Bloom and other corporate executives. Yesterday, at the company’s MarkLogic World 2013 conference, Bloom also prepared the audience to hear more today from company executives about MarkLogic’s next steps in semantics for its MarkLogic Server technology that ingests, manages and searches structured, semi-structured, and unstructured data.
“The way to think about this is that when we look at semantics, we didn’t … say we just want to check a box on semantics,” Bloom said, by working with partners on some low-hanging fruit – although it will be collaborating with them on various semantic enrichment capabilities. “We think semantics is critical technology, and more interesting I believe is that it is a critical technology that is both a search technology as well as a database technology.” Others in the marketplace will focus on changing their search engines to do semantics, but optimum results won’t come if all that’s being done is layering in semantics at the search level, he said.
Algebraix Data Corporation today announced that its SPARQL Server™ RDF database successfully executed all 17 queries of the SP²Bench benchmark at data sizes up to one billion triples on a single computer node. SP²Bench is the most computationally complex benchmark for testing SPARQL performance, and no other vendor has reported results for all of its queries on data sizes above five million triples. Read more
Marc Joffe of the Open Knowledge Foundation (OKF) reports, “Throughout the Eurozone, credit rating agencies have been under attack for their lack of transparency and for their pro-cyclical sovereign rating actions. In the humble belief that the crowd can outperform the credit rating oracles, we are introducing an open database of historical sovereign risk data. It is available at http://www.publicsectorcredit.org/sovdef/ where community members can both view and edit the data. Once the quality of this data is sufficient, the data set can be used to create unbiased, transparent models of sovereign credit risk. The database contains central government revenue, expenditure, public debt and interest costs from the 19th century through 2011 – along with crisis indicators taken from Reinhart and Rogoff’s public database.” Read more
Algebraix Data recently announced “its SPARQL Server™ RDF database is executing the SP2Bench benchmark more than three times faster than reported in June 2012. The dramatic performance improvement is made possible by an algebraic query optimizer that is able to reuse work performed to answer prior queries. Furthermore, SPARQL Server’s Resource Description Framework (RDF) load performance has improved significantly, loading 384,000 triples per second from one file on a workstation class system. This is more than five times faster than June’s performance and is several times faster than any current vendor published results for loading triples from one file.” Read more
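For a sense of scale, a back-of-the-envelope estimate (ours, not Algebraix's) of what the reported 384,000 triples-per-second load rate implies for a billion-triple dataset like the one in the SP²Bench result above:

```python
# Rough estimate from the figures reported in these announcements.
rate = 384_000            # triples per second (reported load rate)
dataset = 1_000_000_000   # one billion triples

seconds = dataset / rate
minutes = seconds / 60
print(f"~{minutes:.0f} minutes to load")
```

That is on the order of three quarters of an hour for a billion triples from a single file, ignoring real-world factors such as I/O contention and index construction.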