On Tuesday in London, Google announced the Google Art Project. The project includes artworks from 17 of the world’s leading institutions, including New York’s Metropolitan Museum of Art, the Museum of Modern Art and the Frick Collection; the Smithsonian’s Freer Gallery of Art in Washington DC; London’s Tate Museum; and museums in Madrid, Moscow, Amsterdam and Florence, among others. The paintings are presented in high definition, and the site has a wonderful user interface for exploring the artworks.

Christophe Guéret noticed that there was something missing: machine-readable, semantic data.

Christophe has a PhD in Computer Science and is based in Amsterdam at the Vrije Universiteit, where he works on LATC, a European Union-funded project.

In a few short hours, between meetings and other work, Christophe created a semantic wrapper for the Google Art Project, the “GoogleArt2RDF wrapper.” The wrapper covers the entire Google Art Project ecosystem: it serves RDF data for any painting made available through GoogleArt.

Let’s take Vincent van Gogh’s “Starry Night” as an example. The human-friendly interface is visible at http://www.googleartproject.com/museums/moma/the-starry-night:

GoogleArt Project - Starry Night by Vincent Van Gogh

Thanks to Christophe, Starry Night now also has RDF data associated with it:
http://linkeddata.few.vu.nl/googleart/museums/moma/the-starry-night.
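
Comparing the two Starry Night addresses suggests that the wrapper simply mirrors the Google Art Project path under its own base URL. Here is a minimal Python sketch of that mapping. The function name `wrapper_url` is made up for illustration, and the rule itself is only inferred from the single pair of URLs above, so it may not hold for every page on the site.

```python
from urllib.parse import urlsplit

# Both values are taken from the two Starry Night URLs quoted in this article.
GOOGLE_ART_HOST = "www.googleartproject.com"
WRAPPER_BASE = "http://linkeddata.few.vu.nl/googleart"


def wrapper_url(art_project_url: str) -> str:
    """Guess the GoogleArt2RDF address for a Google Art Project page.

    Assumption: the wrapper reuses the original path unchanged.
    """
    parts = urlsplit(art_project_url)
    if parts.netloc != GOOGLE_ART_HOST:
        raise ValueError(f"not a Google Art Project URL: {art_project_url}")
    # Drop a trailing slash so both spellings map to the same address.
    return WRAPPER_BASE + parts.path.rstrip("/")


print(wrapper_url("http://www.googleartproject.com/museums/moma/the-starry-night"))
```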

If you are not used to viewing RDF data, not to worry. The raw data can be opened in a text editor, but if you’re just mildly curious, here’s what it looks like (No need to squint! Trust me, it’s RDF).

Semantic Data for Vincent Van Gogh's Starry Night

Now for the technical stuff

The data is expressed primarily using the widely used FOAF (Friend of a Friend) and Dublin Core ontologies. When possible, the author of the painting and the medium used (oil on canvas, etc.) are linked to the corresponding DBpedia resources.
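
To give a feel for what such a description might look like, here is a small Python sketch that builds a few triples for a painting and prints them in N-Triples syntax. It uses only the standard library. The choice of properties (`dct:title`, `dct:creator`, `dct:medium`, `foaf:name`) and the DBpedia resource names are assumptions made for illustration; the actual wrapper output may use different terms.

```python
# Namespace IRIs for Dublin Core terms, FOAF and DBpedia resources.
DCT = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
DBPEDIA = "http://dbpedia.org/resource/"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text: str) -> str:
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def painting_triples(painting, title, artist, artist_name, medium):
    """Describe one painting; the property choice is illustrative only."""
    return [
        (iri(painting), iri(DCT + "title"), literal(title)),
        (iri(painting), iri(DCT + "creator"), iri(artist)),
        (iri(painting), iri(DCT + "medium"), iri(medium)),
        (iri(artist), iri(FOAF + "name"), literal(artist_name)),
    ]


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object) tuples, one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = painting_triples(
    painting="http://linkeddata.few.vu.nl/googleart/museums/moma/the-starry-night",
    title="The Starry Night",
    artist=DBPEDIA + "Vincent_van_Gogh",  # assumed DBpedia resource name
    artist_name="Vincent van Gogh",
    medium=DBPEDIA + "Oil_painting",  # assumed DBpedia resource name
)
print(to_ntriples(triples))
```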

When asked about the tools he used, Christophe responded, “Eclipse with the PyDev environment. The code is in Python2 and I use Tornado for serving and Beautifulsoup for parsing.”

I also asked Christophe if he was ever tempted to create a new ontology or if he found FOAF and Dublin Core adequate for his needs.  He replied, “Not at all! Whenever I RDFize some datasets I always try to avoid creating ontologies as much as possible.  It’s tricky to properly define one and it requires spending some time to link it with other popular ontologies anyway if we want it to be useful.”

detailed view of the use of dct:relation

“As you can see, the relation between the painting and other paintings made by the same artist is defined by the very vague ‘dc:relation.’  I could have created an ontology for a ‘theCreatorAlsoCreatedThis’ relation to use instead, but then that relation would have come from a very specialised ontology nobody will ever use for anything else – I don’t think that would have been such an improvement as compared to using ‘dc:relation.’”
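
The idea Christophe describes can be sketched in a few lines of Python: group paintings by artist, then link each painting to every other painting in the same group with the generic Dublin Core relation property. The function name and the example.org addresses are hypothetical, and this is not Christophe's actual code, only one way such links could be generated.

```python
# Dublin Core terms "relation" property, written dct:relation in prefixed form.
DCT_RELATION = "http://purl.org/dc/terms/relation"


def relation_triples(paintings_by_artist):
    """Link every painting to each other painting by the same artist.

    `paintings_by_artist` maps an artist identifier to a list of painting
    URIs. Triples are returned as (subject, predicate, object) tuples.
    """
    triples = []
    for paintings in paintings_by_artist.values():
        for subject in paintings:
            for other in paintings:
                if subject != other:  # a painting is not related to itself
                    triples.append((subject, DCT_RELATION, other))
    return triples


# Placeholder URIs on example.org, not real wrapper addresses.
example = {
    "artist-a": [
        "http://example.org/painting/1",
        "http://example.org/painting/2",
    ],
    "artist-b": ["http://example.org/painting/3"],
}

for triple in relation_triples(example):
    print(triple)
```

Because the relation is stated in both directions, an artist with n paintings yields n × (n − 1) triples, and an artist with a single painting yields none.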

Now that the data is available in this format, it will be interesting to see what people do with it. “This is a first version of the system which does not yet export all the data from Google,” says Christophe. “Comments and suggestions on how to improve it are much welcome!”

At the moment, the data may only be available for individual paintings, but this is a powerful first step.