Max Smolaks of Tech Week Europe reports, “Andrew Fogg, co-founder of the UK start-up Import.io, thinks every web resource should have an Application Programming Interface (API). In order to make online data more accessible, his company turns any website into a spreadsheet or an API, for free. Fogg claims that in the past few months, the users of this service have created more Web APIs than the rest of the Internet combined. Jerome Bouteiller has interviewed the entrepreneur at LeWeb 2013 conference in Paris, where the two discussed the future of the company and the idea of the Semantic Web, proposed by the ‘father of the Internet’ Sir Tim Berners-Lee.” Read more
Posts Tagged ‘database’
Our discussion of Big Data at SemTechBiz, begun here, continues:
The Enterprise Linked Data Cloud Needs Semantics, And More
Another exploration of Big Data’s intersection with semantic technology will take place at this session, where Dr. Giovanni Tummarello, senior research fellow at DERI and CTO of SindiceTech, will talk about how Big Data can become the enabler that makes semantic technology really useful in enterprises. “A lot of people say it’s via Big Data that semantic technologies like RDF will see a coming of age and clear applications in certain industries,” he says. There’s value to adding data first and understanding it later, and to that end, “semantic technologies give you the most agile tool to deal with data you don’t know, where there’s a lot of diversity, and you don’t know what of it particularly will be useful.”
Last month at its MarkLogic World 2013 conference, the enterprise NoSQL database platform provider talked semantics as it related to its MarkLogic Server technology that ingests, manages and searches structured, semi-structured, and unstructured data (see our story here). The vendor late last week was scheduled to provide an early access release of MarkLogic 7, formally due by year’s end, to some dozens of initial users.
“People see a convergence of search and semantics,” Stephen Buxton, Director, Product Management, recently told The Semantic Web Blog. A lot of the vendor’s customers have deployed MarkLogic technology as well as specialized triple stores, but what they really want is an integrated approach, “a single database that does both individually and both together,” he says. “We see the future of search as semantics and the future of semantics as search, and they are very much converging.” At its recent conference, Buxton says, the company demonstrated a MarkLogic app it built to function like Google’s Knowledge Graph, to give an idea of the kinds of things the enterprise might do with search and semantics together.
Following up on the comments made by MarkLogic CEO Gary Bloom at his keynote address at the conference, Buxton explained that, “the function in MarkLogic we are working on in engineering is a way to store and manage triples in the MarkLogic database natively, right alongside structured and unstructured information – a specialized triples index so queries are very fast, and so you can do SPARQL queries in MarkLogic. So, with MarkLogic 7 we will have a world-class triple store and world-beating information store – no one else does documents, values and triples in combination the way MarkLogic 7 will.”
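To make the idea of a "specialized triples index" concrete, here is a minimal sketch in Python. It is not MarkLogic's implementation and does not use any MarkLogic API; the class name `TripleIndex`, its methods, and the sample identifiers such as `doc:42` are all hypothetical, chosen only for illustration. The sketch keeps one lookup table per triple position, so a pattern with a bound subject, predicate or object is answered by intersecting small sets rather than scanning every triple, which is roughly the kind of shortcut a triple index gives a SPARQL-style pattern match.

```python
from collections import defaultdict


class TripleIndex:
    """Toy in-memory triple index (illustrative only, not MarkLogic's design)."""

    def __init__(self):
        self._triples = set()
        # One lookup table per position: subject, predicate, object.
        self._by_subject = defaultdict(set)
        self._by_predicate = defaultdict(set)
        self._by_object = defaultdict(set)

    def add(self, s, p, o):
        """Store a triple and register it in all three position indexes."""
        triple = (s, p, o)
        self._triples.add(triple)
        self._by_subject[s].add(triple)
        self._by_predicate[p].add(triple)
        self._by_object[o].add(triple)

    def match(self, s=None, p=None, o=None):
        """Return the set of triples matching a pattern; None is a wildcard."""
        candidates = []
        if s is not None:
            candidates.append(self._by_subject.get(s, set()))
        if p is not None:
            candidates.append(self._by_predicate.get(p, set()))
        if o is not None:
            candidates.append(self._by_object.get(o, set()))
        if not candidates:
            # Fully unbound pattern: every stored triple matches.
            return set(self._triples)
        # Intersect starting from the smallest candidate set.
        candidates.sort(key=len)
        result = set(candidates[0])
        for other in candidates[1:]:
            result &= other
        return result


index = TripleIndex()
index.add("doc:42", "mentions", "org:NASA")
index.add("doc:42", "author", "person:alice")
index.add("doc:7", "mentions", "org:NASA")

# Which documents mention NASA? Comparable to a one-pattern SPARQL query.
docs = sorted(s for s, _, _ in index.match(p="mentions", o="org:NASA"))
print(docs)
```

A production triple store would add persistence, compression and a query planner on top of this; the sketch only shows why indexing by position makes pattern lookups cheap.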
Philipp Gratzel Von Gratz of Medical Xpress writes, “Imagine a hospital where patient data from numerous sources is made accessible to ward physicians with the help of hyperlinks and intelligent indexing. Imagine a healthcare system that hands its patients – not an envelope or a CD-ROM – but an integrated dataset that allows them to truly understand their illness, and even use the Internet to obtain additional information. Imagine a radiologist who uses semantic technologies to navigate smoothly through the myriad imaging data. Welcome to the future of semantic technologies in health information retrieval.” Read more
Enterprise NoSQL database platform provider MarkLogic has come into some cash: a $25 million round of growth capital from investors including Sequoia Capital, Tenaya Capital, Northgate Capital, CEO Gary Bloom and other corporate executives. Yesterday, at the company’s MarkLogic World 2013 conference, Bloom also prepared the audience to hear more today from company executives about MarkLogic’s next steps in semantics for its MarkLogic Server technology that ingests, manages and searches structured, semi-structured, and unstructured data.
“The way to think about this is that when we look at semantics, we didn’t … say we just want to check a box on semantics,” Bloom said, referring to the option of simply working with partners on some low-hanging fruit, although the company will collaborate with them on various semantic enrichment capabilities. “We think semantics is critical technology, and more interesting I believe is that it is a critical technology that is both a search technology as well as a database technology.” Others in the marketplace will focus on changing their search engines to do semantics, but optimum results won’t come if all that’s being done is layering in semantics at the search level, he said.
Algebraix Data Corporation today announced its SPARQL Server™ RDF database successfully executed all 17 of its queries on the SP2 benchmark up to one billion triples on one computer node. The SP2 benchmark is the most computationally complex benchmark for testing SPARQL performance, and no other vendor has reported results for all queries on data sizes above five million triples. Read more
Marc Joffe of the OKF reports, “Throughout the Eurozone, credit rating agencies have been under attack for their lack of transparency and for their pro-cyclical sovereign rating actions. In the humble belief that the crowd can outperform the credit rating oracles, we are introducing an open database of historical sovereign risk data. It is available at http://www.publicsectorcredit.org/sovdef/ where community members can both view and edit the data. Once the quality of this data is sufficient, the data set can be used to create unbiased, transparent models of sovereign credit risk. The database contains central government revenue, expenditure, public debt and interest costs from the 19th century through 2011 – along with crisis indicators taken from Reinhart and Rogoff’s public database.” Read more
Algebraix Data recently announced “its SPARQL Server™ RDF database is executing the SP2Bench benchmark more than three times faster than reported in June 2012. The dramatic performance improvement is made possible by an algebraic query optimizer that is able to reuse work performed to answer prior queries. Furthermore, SPARQL Server’s Resource Description Framework (RDF) load performance has improved significantly, loading 384,000 triples per second from one file on a workstation class system. This is more than five times faster than June’s performance and is several times faster than any current vendor published results for loading triples from one file.” Read more
Querying semantic databases isn’t necessarily the most user-friendly thing to do on the planet. Consultancy ABComputing is trying to change that, with its EQL (Entity Query Language) technology.
“We wanted to where possible have it so the syntax was more closely mirrored with SQL than with SPARQL because people understand SQL,” says Martin Bradford, primary developer at the company. “If you build on that knowledge, that helps matters.”
EQL came about from the company’s work on a potential contract that involved semantic technology. Exposure to the world of semantic web technologies and SPARQL in particular led Antonia Bradford, who started the firm a couple of decades ago, to conclude that there had to be a better way of working with RDF data without sacrificing the power inherent in the semantic web.
Clark & Parsia’s Stardog lightweight RDF database is moving into release candidate 1.0 mode just in time for next week’s Semantic Technology & Business Conference in San Francisco. The product has been stable and usable for a while now, but a 1.0 designation still carries weight with a good number of IT buyers.
The focus for the product, says cofounder and managing principal Kendall Clark, is to be optimized for what he says is the fat part of the market – and that’s not the part that is dealing with a trillion RDF triples. “Most people and organizations don’t need to scale to trillions of anything,” though scaling up, and up, and up, is where most of Clark & Parsia’s competitors have focused their attention, he says. “We’ve seen a significant percentage of what people are doing with semantic technology and most applications are not at a billion triples today.” Take as an example Clark & Parsia’s customer, NASA, which built an expertise location system based on semantic technology that today is still not more than 20 million triples. “You might say that’s a little toy but not if you are at NASA and need defined experts, it is a real, valuable thing and we see this all the time,” he says.