Posts Tagged ‘SPARQL’

Building The Scientific Knowledge Graph

Standard Analytics, a participant at the recent TechStars event in New York City, has a big goal in mind: to organize the world’s scientific information by building a complete scientific knowledge graph.

The company’s co-founders, Tiffany Bogich and Sebastien Ballesteros, concluded from their own experience as researchers that someone had to take on the job. A problem they faced, says Bogich, was getting access to all the information behind published results, as well as searching and discovering across papers. “Our thesis is that if you can expose the moving parts – the data, code, media – and make science more discoverable, you can really advance and accelerate research,” she says.

End of Support for the Sindice.com search engine: history, lessons learned, and legacy (Guest Post)

[Editor's Note: Since 2007, Sindice.com has served as a specialized search engine allowing Semantic Web practitioners and researchers to locate structured data on the Web. At the peak of its activity, Sindice.com had an index of over 700M pages and processed 20M pages per day. In a post last week, the founding team announced the end of support for Sindice.com to concentrate on delivering the technology developed for the engine to enterprise users. This week, SemanticWeb.com is proud to host a guest post by the founding team explaining the history, the challenges and the future of this technology.]

Photo of the Sindice Team, 2012

The word “Sindice” has been around for quite some time in research and practice on the “Semantic Web” or “let’s see how we can turn the web into a database”.

Since 2007, Sindice.com has served as a specialized search engine that would do a crazy thing: throw away the text and just concentrate on the “markup” of the web pages. Sindice would provide an advanced API to query RDF, RDFa, Microformats and Microdata found on web sites, together with a number of other services. Sindice proved useful, we guess, as approximately 1100 scientific works in the last few years refer to it in one way or another.

Last week, we, the founding team, announced the end of our support of the original Sindice.com semantic search engine to concentrate on the technology that came from it.

With the launch in 2011 of Schema.org, Google and others have effectively embraced the vision of the “Semantic Web.” With the RDFa standard, and now even more with JSON-LD, richer markup is becoming more and more popular on websites. While there might not be public web data “search APIs”, large collections of crawled data (pages and RDF) exist today which are made available on cloud computing platforms for easy analysis with your favorite big data paradigm.

Even more interestingly, the technology of Sindice.com has been made available in several projects, maintained either as open source (see below) or commercially supported by the Sindice.com team, which has now transitioned into the company Sindice LTD, a.k.a. SindiceTech.

It has been quite a journey for us, and given that there is no single summary anywhere, we thought we’d take this occasion to write and share it.

This is both for “historical” reasons and as a way to glimpse at future directions of this field and these technologies.

CHAIN-REDS Project Enhances Semantic Search And Extends Reproducibility Of Scientific Data

The CHAIN-REDS FP7 project, co-funded by the European Commission, aims to build a knowledge base of information, gathered both from dedicated surveys and from other web and document sources, covering well over half of the countries in the world, which it presents to visitors through geographic maps and tables. Earlier this month, its Knowledge Base and Semantic Search Engine for exploring the more than 30 million documents in its Open Access Document Repositories (OADR) and Data Repositories (DR) became available in a smartphone and tablet app, and the results of its Semantic Search Engine are now ranked according to the January 2014 Ranking Web of Repositories. So, users conducting searches should see results in the order of the highest-ranked repositories.

The project has its roots in using semantic web technologies to correlate the data used to write scientific papers with the documents themselves whenever possible, says Prof. Roberto Barbera, of the Department of Physics and Astronomy at the University of Catania, as well as with applications that can be used to analyse the information. To these ends, the CHAIN-REDS consortium semantically enriched its repositories and built its search engine on the related Linked Data. Users in search of information can get papers and data and, if applications are available, can be redirected to them on the project’s cloud infrastructure to reproduce and reanalyze the data.

“There is a huge effort in the scientific world about the reproducibility of science,” says Barbera.

The Web Is 25 — And The Semantic Web Has Been An Important Part Of It

NOTE: This post was updated at 5:40pm ET.

Today the Web celebrates its 25th birthday, and we celebrate the Semantic Web’s role in that milestone. And what a milestone it is: As of this month, the Indexed Web contains at least 2.31 billion pages, according to WorldWideWebSize.  

The Semantic Web Blog reached out to the World Wide Web Consortium’s current and former semantic leads to get their perspective on the roads The Semantic Web has traveled and the value it has so far brought to the Web’s table: Phil Archer, W3C Data Activity Lead coordinating work on the Semantic Web and related technologies; Ivan Herman, who last year transitioned roles at the W3C from Semantic Activity Lead to Digital Publishing Activity Lead; and Eric Miller, co-founder and president of Zepheira and the leader of the Semantic Web Initiative at the W3C until 2007.

While the Semantic Web came to the attention of the wider public in 2001, with the publication in Scientific American of The Semantic Web by Tim Berners-Lee, James Hendler and Ora Lassila, Archer points out that “one could argue that the Semantic Web is 25 years old,” too. He cites Berners-Lee’s March 1989 paper, Information Management: A Proposal, which includes a diagram showing relationships that are immediately recognizable as triples. “That’s how Tim envisaged it from Day 1,” Archer says.

RDF 1.1 is a W3C Recommendation

Almost exactly 10 years after the publication of RDF 1.0 (10 Feb 2004, http://www.w3.org/TR/rdf-concepts/), the World Wide Web Consortium (W3C) has announced today that RDF 1.1 has become a “Recommendation.” In fact, the RDF Working Group has published a set of eight Resource Description Framework (RDF) Recommendations and four Working Group Notes. One of those notes, the RDF 1.1 Primer, is a good starting place for those new to the standard.

SemanticWeb.com caught up with Markus Lanthaler, co-editor of the RDF 1.1 Concepts and Abstract Syntax document, to discuss this news.

Lanthaler said of the recommendation, “Semantic Web technologies are often criticized for their complexity–mostly because RDF is being conflated with RDF/XML. Thus, with RDF 1.1 we put a strong focus on simplicity. The new specifications are much more accessible and there’s a clear separation between RDF, the data model, and its serialization formats. Furthermore, the primer provides a great introduction for newcomers. I’m convinced that, along with the standardization of Turtle (and previously JSON-LD), this will mark an important point in the history of the Semantic Web.”
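To make the separation between data model and serialization concrete, here is a minimal Python sketch (ours, not from the RDF Working Group). It holds one statement as a plain (subject, predicate, object) tuple and writes it out twice: once as an N-Triples line and once as an expanded-form JSON-LD node object. The example.org IRI and the name "Alice" are invented for illustration, and the helper functions are hypothetical; they only handle a simple string literal without newlines.

```python
import json

# One RDF statement in the abstract data model: (subject, predicate, object).
# The example.org IRI and the literal value are made up for illustration.
TRIPLE = (
    "http://example.org/alice",
    "http://xmlns.com/foaf/0.1/name",
    "Alice",
)


def to_ntriples(triple):
    """Write a triple whose object is a plain string literal as one N-Triples line."""
    s, p, o = triple
    # Escape backslashes and double quotes inside the literal.
    escaped = o.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{s}> <{p}> "{escaped}" .'


def to_jsonld(triple):
    """Write the same triple as an expanded-form JSON-LD node object."""
    s, p, o = triple
    return {"@id": s, p: [{"@value": o}]}


ntriples_line = to_ntriples(TRIPLE)
jsonld_text = json.dumps(to_jsonld(TRIPLE), indent=2)

print(ntriples_line)
print(jsonld_text)
```

The tuple is the data; the two outputs are just different ways of writing it down, which is the distinction the quote draws between RDF and formats such as RDF/XML, Turtle or JSON-LD.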

First of Four Getty Vocabularies Made Available as Linked Open Data

Jim Cuno, the President and CEO of the Getty, announced yesterday that the Getty Research Institute has released the Art & Architecture Thesaurus (AAT) ® as Linked Open Data. Cuno said, “The Art & Architecture Thesaurus is a reference of over 250,000 terms on art and architectural history, styles, and techniques. It’s one of the Getty Research Institute’s four Getty Vocabularies, a collection of databases that serves as the premier resource for cultural heritage terms, artists’ names, and geographical information, reflecting over 30 years of collaborative scholarship.”

The data set is available for download at vocab.getty.edu under an Open Data Commons Attribution License (ODC BY 1.0). Vocab.getty.edu offers a SPARQL endpoint, as well as links to the Getty’s Semantic Representation documentation, the Getty Ontology, links for downloading the full data sets, and more.
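For readers who have not used a SPARQL endpoint before, the following Python sketch shows how a query is typically sent under the SPARQL 1.1 Protocol: as a URL-encoded `query` parameter on an HTTP GET. The endpoint path `http://vocab.getty.edu/sparql` is an assumption on our part (the announcement names only the host), and the query is a generic "first five triples" probe that assumes nothing about the Getty Ontology. The `run_select` helper is hypothetical and is defined but not called here, since it needs network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# ASSUMPTION: the exact endpoint path is not given in the announcement.
ENDPOINT = "http://vocab.getty.edu/sparql"

# A generic probe query; it does not rely on any Getty-specific vocabulary.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_request_url(endpoint, query):
    """Encode a SPARQL query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


def run_select(endpoint, query, timeout=30):
    """Send the query and return the result bindings (requires network access)."""
    request = Request(
        build_request_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

Calling `run_select(ENDPOINT, QUERY)` would, if the assumed path is right, return a list of dictionaries keyed by the variable names `s`, `p` and `o`.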

Keep On Keeping On

“There is nothing more difficult to plan, more doubtful of success, nor more dangerous to manage than the creation of a new order of things…. Whenever his enemies have the ability to attack the innovator, they do so with the passion of partisans, while the others defend him sluggishly, so that the innovator and his party alike are vulnerable.”
–Niccolò Machiavelli, The Prince (1513)

In case you missed it, a series of recent articles has made a Big Announcement:

The Semantic Web is not here yet.

Additionally, neither are flying cars, the cure for cancer, humans traveling to Mars or a bunch of other futuristic ideas that still have merit.

A problem with many of these articles is that they conflate the Vision of the Semantic Web with the practical technologies associated with the standards. While the Whole Enchilada has yet to emerge (and may never do so), the individual technologies are finding their way into ever more systems in a wide variety of industries. These are not all necessarily on the public Web, they are simply Webs of Data. There are plenty of examples of this happening and I won’t reiterate them here.

Instead, I want to highlight some other things that are going on in this discussion that are largely left out of these narrowly-focused, provocative articles.

First, the Semantic Web has a name attached to its vision and it has for quite some time. As such, it is easy to remember and it is easy to remember that it Hasn’t Gotten Here Yet. Every year or so, we have another round of articles that are more about cursing the darkness than lighting candles.

In that same timeframe, however, we’ve seen the ascent and burnout of Service-Oriented Architectures (SOA), Enterprise Service Buses (ESBs), various MVC frameworks, server-side architectures, etc. Everyone likes to announce $20 million sales of an ESB to clients. No one generally reports on the $100 million write-downs on failed initiatives when they surface in annual reports a few years later. So we are left with a skewed perspective on the efficacy of these big “conventional” initiatives.

Hello 2014 (Part 2)

Courtesy: Flickr/faul

Picking up from where we left off yesterday, we continue exploring where 2014 may take us in the world of semantics, Linked and Smart Data, content analytics, and so much more.

Marco Neumann, CEO and co-founder, KONA and director, Lotico: On the technology side, I am personally looking forward to making use of the new RDF 1.1 implementations and the new SPARQL endpoint deployment solutions in 2014. The Semantic Web idea is here to stay, though you might call it by a different name (again) in 2014.

Bill Roberts, CEO, Swirrl: Looking forward to 2014, I see a growing use of Linked Data in open data ‘production’ systems, as opposed to proofs of concept, pilots and test systems. I expect good progress on taking Linked Data out of the hands of specialists to be used by a broader group of data users.

Hello 2014

Courtesy: Flickr/Wonderlane

Yesterday we said a fond farewell to 2013. Today, we look ahead to the New Year, with the help, once again, of our panel of experts:

Phil Archer, Data Activity Lead, W3C:

For me the new Working Groups (WG) are the focus. I think the CSV on the Web WG is going to be an important step in making more data interoperable with Sem Web.

I’d also like to draw attention to the upcoming Linking Geospatial Data workshop in London in March. There have been lots of attempts to use Geospatial data with Linked Data, notably GeoSPARQL of course. But it’s not always easy. We need to make it easier to publish and use data that includes geocoding in some fashion along with the power and functionality of Geospatial Information systems. The workshop brings together W3C, OGC, the UK government [Linked Data Working Group], Ordnance Survey and the geospatial department at Google. It’s going to be big!

[And about] JSON-LD: It’s JSON so Web developers love it, and it’s RDF. I am hopeful that more and more JSON will actually be JSON-LD. Then everyone should be happy.
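A small Python sketch of the "it's JSON, and it's RDF" point: the same document is first read as ordinary JSON, then its keys are resolved through `@context` to get RDF-style triples. The document, the example.org identifier and the organization names are invented for illustration. The term resolution is done by hand and only covers a flat context with string values; a real application would use a full JSON-LD processor rather than this shortcut.

```python
import json

# A made-up JSON-LD document; the @context maps short keys to schema.org IRIs.
DOC = """
{
  "@context": {
    "name": "http://schema.org/name",
    "alternateName": "http://schema.org/alternateName"
  },
  "@id": "http://example.org/org/1",
  "name": "Example Org",
  "alternateName": "ExOrg"
}
"""

data = json.loads(DOC)

# The JSON view: a web developer just reads a key.
plain_name = data["name"]

# The RDF view: resolve each key through @context to a full predicate IRI.
# Hand-rolled for a flat, string-valued context only (an assumption of this sketch).
context = data["@context"]
triples = [
    (data["@id"], context[key], value)
    for key, value in data.items()
    if key in context
]

print(plain_name)
for triple in triples:
    print(triple)
```

Nothing about the first access requires knowing RDF, which is why the format appeals to web developers; the second view is what makes the same bytes usable as Linked Data.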

Good-Bye 2013

Courtesy: Flickr/MadebyMark

As we prepare to greet the New Year, we take a look back at the year that was. Some of the leading voices in the semantic web/Linked Data/Web 3.0 and sentiment analytics space give us their thoughts on the highlights of 2013.

Read on:

Phil Archer, Data Activity Lead, W3C:

The completion and rapid adoption of the updated SPARQL specs, the use of Linked Data (LD) in life sciences, the adoption of LD by the European Commission, and governments in the UK, The Netherlands (NL) and more [stand out]. In other words, [we are seeing] the maturation and growing acknowledgement of the advantages of the technologies.

I contributed to a recent study into the use of Linked Data within governments. We spoke to various UK government departments as well as the UN FAO, the German National Library and more. The roadblocks and enablers section of the study (see here) is useful IMO.

Bottom line: Those organisations use LD because it suits them. It makes their own tasks easier, and it allows them to fulfill their public tasks more effectively. They don’t do it to be cool, and they don’t do it to provide 5-Star Linked Data to others. They do it for hard-headed and self-interested reasons.

Christine Connors, founder and information strategist, TriviumRLG:

What sticks out in my mind is the resource market: We’ve seen more “semantic technology” job postings, academic positions and M&A activity than I can remember in a long time. I think that this is a noteworthy trend if my assessment is accurate.

There’s also been a huge increase in the attentions of the librarian community, thanks to long-time work at the Library of Congress, from leading experts in that field and via schema.org.
