A Drupal ++ platform for semantic web biomedical data – that’s how Sudeshna Das describes eXframe, a reusable framework for creating online repositories of genomics experiments. Das – who among other titles is affiliate faculty of the Harvard Stem Cell Institute – is one of the developers of eXframe, which leverages Stéphane Corlosquet’s RDF module for Drupal to produce, index (into an RDF store powered by the ARC2 PHP library) and publish semantic web data in the second generation version of the platform.

“We used the RDF modules to turn eXframe into a semantic web platform,” says Das. “That was key for us because it hid all the complexities of semantic technology.”

One instance of the platform today can be found in the repository for stem cell data as part of the Stem Cell Commons, the Harvard Stem Cell Institute’s community for stem cell bioinformatics. But Das stresses the platform’s reusability: because eXframe is designed for building genomics repositories that automatically produce Linked Data as well as a SPARQL endpoint, new repository instances can be built with much less effort. Working off Drupal as its base, eXframe has been customized to support biomedical data and to integrate biomedical ontologies and knowledge bases.

Plans are in the works to develop an online repository for neurology experiments in the next year or so as another instance of eXframe, for example, and Das hopes that in a couple of years there will be dozens more. Currently the focus is on internal Harvard departments, labs or institutions, but Das would like to expand that to external groups in the future.

Digging In To The Data

The data in the Stem Cell Commons repository is structured and annotated using various ontologies developed by the bioinformatics community, for cell types, organisms, tissues, and more, Das explains. Mapping existing biomedical ontologies to the repository’s RDF data enables interoperability with other resources that take advantage of semantic web data formats. Recently, for instance, she learned that eagle-i and the New York Stem Cell Foundation are developing a Semantic Web resource for Induced Pluripotent Stem Cells.

“You can do more with the data because you have annotated it using these ontologies,” she says, such as the Disease Ontology and Drug Ontology. “You can ask questions, such as with this drug, what are the different diseases that can be treated and are there models of this disease in the database? So you can take advantage of ontologies and other semantic web knowledge bases or repositories – you can leverage all these to ask questions about our data.”
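
To make the kind of question Das describes concrete, here is a minimal sketch in plain Python. It is not eXframe code, and a real query would be written in SPARQL against the repository’s endpoint. Every identifier in it (the `drug:`, `disease:` and `experiment:` names, the `treats` and `models` predicates, and the helper functions) is hypothetical and invented for illustration; none of it comes from the Stem Cell Commons, the Disease Ontology or the Drug Ontology.

```python
# Toy stand-in for annotated RDF data: a set of (subject, predicate, object)
# triples. All identifiers and predicates here are hypothetical.
TRIPLES = {
    ("drug:D1", "treats", "disease:X"),
    ("drug:D1", "treats", "disease:Y"),
    ("drug:D2", "treats", "disease:Z"),
    ("experiment:E1", "models", "disease:X"),
    ("experiment:E2", "models", "disease:Z"),
}


def objects(subject, predicate, triples=TRIPLES):
    """Return, sorted, every object linked from `subject` by `predicate`."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def subjects(predicate, obj, triples=TRIPLES):
    """Return, sorted, every subject linked to `obj` by `predicate`."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


def disease_models_for_drug(drug, triples=TRIPLES):
    """Map each disease the drug treats to the experiments that model it."""
    return {
        disease: subjects("models", disease, triples)
        for disease in objects(drug, "treats", triples)
    }


if __name__ == "__main__":
    for disease, experiments in disease_models_for_drug("drug:D1").items():
        print(disease, "->", experiments or "no models in the database")
```

The point of the sketch is the join: one set of annotations links drugs to diseases, another links experiments to diseases, and the shared disease identifier lets the two be combined into a single answer. Shared ontology terms play that linking role in the repository’s RDF data.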

That said, Das adds that there is work yet to do “to develop an easy-to-use user interface for bench biomedical scientists to ask questions more easily, so that they can use the semantic web to do queries on their data. For our next stage of the project, we’d like to build that easy to use interface, to make it easier for the general public to use this technology without having to learn SPARQL.”

Das says that being able to use the Drupal RDF module to drive the platform’s development was a huge help, as she and the eXframe team weren’t themselves semantic web experts. Now, says Das, “I totally believe in the power of the semantic web for biology.”