VIVO, a semantic information representation system that enables collaboration among scientists across all disciplines, has had a busy summer. The open source project, which aims to advance research and discovery by integrating information about scholars, their activities, and their outputs, gained a more permanent home in the DuraSpace Incubator, ensuring a way to continue activities after its NIH grant continuation year ran out. It saw the publication of VIVO: A Semantic Approach to Scholarly Networking and Discovery. And Northwestern University brought its researchers together in a single hub called Northwestern Scholars, an implementation of Elsevier’s SciVal Experts research networking tool (see our story here).

The future is looking pretty bright, too. “We are very interested in funding, research resources, scholarly works, scholars and data sets,” says Mike Conlon, primary investigator of the VIVO project. “As the world moves forward, these things are all inter-related, but that’s been very blurry, especially to organizations and institutions.” Funding agencies, for example, want to know what work was produced as a result of their grants to a major center. It is no longer just a question of who wrote a paper, but of who funded it, what tools were behind it, how the data was produced, and how all these things inter-relate in a scholarly data system.

“Connecting these things becomes the work of the future,” says Conlon.

The VIVO data models can handle that, letting users put the central focus on their own world views to see how other things in the network connect to it. “If your world view is the Large Hadron Collider, you could put that in the middle, see the funding into it, the investigators who worked on it, all those things related to research resources.” That’s a way to help address issues of accountability, of productivity, and of linkage – of people understanding what leads to what, he explains.
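As a rough illustration of putting one entity "in the middle," the sketch below stores a handful of made-up subject-predicate-object statements and collects everything directly linked to a chosen focus. The entity names, the predicate names, and the neighborhood helper are all hypothetical and are not taken from the VIVO ontology or software.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) statements. The predicate names
# are illustrative only and do not come from the VIVO ontology.
TRIPLES = [
    ("Grant-A", "funds", "Large Hadron Collider"),
    ("Investigator-1", "worksOn", "Large Hadron Collider"),
    ("Investigator-2", "worksOn", "Large Hadron Collider"),
    ("Large Hadron Collider", "produces", "Dataset-X"),
    ("Paper-1", "uses", "Dataset-X"),
]


def neighborhood(triples, focus):
    """Group everything directly linked to `focus` by relationship.

    Outgoing links keep the predicate name; incoming links are prefixed
    with "inverse:" so the direction of the relationship stays visible.
    """
    view = defaultdict(list)
    for subj, pred, obj in triples:
        if subj == focus:
            view[pred].append(obj)
        elif obj == focus:
            view["inverse:" + pred].append(subj)
    return dict(view)


# Put the collider "in the middle" and look at what connects to it.
lhc_view = neighborhood(TRIPLES, "Large Hadron Collider")
for relation, things in lhc_view.items():
    print(relation, "->", things)
```

Changing the focus argument to a grant, a person, or a data set re-centers the same data on a different world view without changing how it is stored.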

There are three ways to implement VIVO as a semantic source of interchangeable data for scholarship around the world: downloading VIVO’s open source software; using different software that puts out semantic web data using the VIVO ontology; or using a commercial product that collects information for a specific purpose, such as activity reporting, but also leaves the institution with enough information to put up a VIVO. “They get the extra benefits for faculty, like research discovery across the institution and eventually the world, and they can provide data in an open and exchangeable format that the institution went to all the trouble to collect,” Conlon says.

Together, the three approaches have led to more than 100 systems now being implemented and heading toward production of VIVO-standard data, he says.

In fact, euroCRIS, the European Organization for International Research Information, is aligning data models so that when it talks about research information systems, its data model and VIVO’s are fully in sync.

Next-Gen Scholarship

VIVO is also helping to push forward next-generation, semantic-web enabled scholarship so that new concepts like micro-attribution can be facilitated. Micro-attribution means stating exactly how each author of a paper contributed to a project – whether it was producing particular data, having a sub-role in producing data, curating the data, tuning certain equipment – to facilitate research partnerships. “There is an unbelievable goldmine for building teams, building next-generation recommendation systems through scholarship to build projects up,” he says. In fact, there is active collaboration with the SONIC (Science of Networks in Communities) Laboratory at Northwestern, where a recommendation system, the C-IKNOW VIVO, is being developed based on data produced by VIVO.

Another giant concept in the VIVO community, he says, is that of nanopublication. “The idea is we take the scientific work and deconstruct it into the actual scientific assertions being made,” Conlon says. Papers are full of assertions – those made to show the scientists took the right steps, those that are repetitions of facts already known, and those that are actual new assertions, he notes.

“So we can bust the paper to its assertions and tie it together using the semantic framework – then we can say there are very high provenance assertions around particular scientific assertions, or these assertions have been made 8,000 times and are well accepted. And these others are new, and maybe not made by people we know well or whose provenance is high, but they are appearing over and over again and we are starting to build some evidence these things may be true,” he says.
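To make the idea of counting assertions concrete, here is a minimal sketch that assumes each paper has already been broken into individual assertions. The sample data, the classify helper, and the cutoff of three papers are all hypothetical; Conlon does not describe a specific algorithm or threshold, and real provenance scoring would weigh who made each assertion as well as how often it appears.

```python
from collections import Counter

# Hypothetical data: each entry pairs a paper with one assertion it makes.
# An assertion is a (subject, relation, object) tuple.
STATEMENTS = [
    ("paper-1", ("gene-G", "regulates", "protein-P")),
    ("paper-2", ("gene-G", "regulates", "protein-P")),
    ("paper-3", ("gene-G", "regulates", "protein-P")),
    ("paper-3", ("protein-P", "inhibits", "enzyme-E")),
]


def classify(statements, accepted_at=3):
    """Label each assertion by how many distinct papers make it.

    `accepted_at` is an arbitrary illustrative cutoff, not a value from
    the VIVO project.
    """
    # set() ensures a paper repeating the same assertion counts only once.
    counts = Counter(assertion for _, assertion in set(statements))
    return {
        assertion: "well accepted" if n >= accepted_at else "emerging"
        for assertion, n in counts.items()
    }


labels = classify(STATEMENTS)
for assertion, label in labels.items():
    print(assertion, "->", label)
```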

What matters is not the number of papers a scientist writes, because papers are full of repetitions, but “how many new things he had to say and what becomes part of the canon,” Conlon explains. “That’s what establishes that person’s reputation in the field.”