
Semantic Web and Cloud Computing Technologies: Cancer’s Killer App?

There’s been progress on a project that’s relying on semantic web technology and the cloud to help detect biomarkers for lung cancer in its early stages. As reported here, the project is sponsored by the non-profit Canary Foundation and the National Cancer Institute’s Early Detection Research Network (EDRN), and leverages a translational research informatics platform to be provisioned by GenoLogics Life Sciences Software and the NASA Jet Propulsion Laboratory (JPL).

The project recently went live with a closed portal site that connects the diverse proteomics and genomics data it has collected from seven research groups affiliated with the program. On tap for next month are talks with NASA JPL about analyzing the data via a machine-driven approach.

Photo courtesy: Flickr/Pulmonary Pathology


Each group involved in the project, such as the British Columbia Cancer Agency and the University of Texas Southwestern Medical Center, knows how to analyze its localized data, explains James DeGreef, VP of market strategy for GenoLogics. The portal now gives humans the means to collaborate on the collected data (such as cell lines and patient tumor samples), which has now been processed from its raw state.

“But we want to use a computer-driven approach to do some pathway models [which relates to how genes and proteins interact in pathways] and data mining,” says DeGreef. “That’s where [the connection to] NASA JPL will help.” Through APIs, the data will be converted into semantic web formats such as RDF. JPL experts, who have extensive experience analyzing interplanetary data with semantic web technologies, will bring to the program their insight into turning so much data into knowledge.
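To make the RDF conversion step concrete, here is a minimal sketch of how one flat sample record might be turned into RDF triples in the N-Triples format, using only the Python standard library. It is an illustration, not the project's actual pipeline: the `example.org` namespace, the field names, and the sample record are all assumptions, since the project's vocabulary and APIs aren't public here.

```python
from urllib.parse import quote

# Hypothetical namespace: no real vocabulary or URIs are known for the project.
BASE = "http://example.org/biomarker/"


def escape_literal(value):
    """Escape a string for use inside a double-quoted N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record):
    """Turn one flat record into N-Triples lines, one triple per field.

    The record's "id" becomes the subject URI; every other key becomes
    a predicate URI with the value stored as a plain string literal.
    """
    subject = f"<{BASE}sample/{quote(str(record['id']), safe='')}>"
    triples = []
    for key, value in record.items():
        if key == "id":
            continue
        predicate = f"<{BASE}{quote(key, safe='')}>"
        obj = f'"{escape_literal(str(value))}"'
        triples.append(f"{subject} {predicate} {obj} .")
    return triples


# Made-up record for illustration only; not data from the project.
sample = {"id": "S-001", "source": "Group A", "sampleType": "cell line"}
lines = record_to_ntriples(sample)
print("\n".join(lines))
```

Once records from different groups share subject and predicate URIs like these, they can be loaded into a single graph and queried together, which is the kind of cross-group, machine-driven analysis DeGreef describes.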

And that’s no small task given the tremendous number of variables involved. “The complexity is massive — there’s a lot more noise and confounding factors, as a lot of cancers are being reclassified as multiple diseases as opposed to one, and every human is different too,” says DeGreef. “So the problem is massive, but solving the problem would provide huge benefits to humanity.” (See below for why:)

lungcancer.gif

The project is continuing with its plans to rely on the Amazon EC2 cloud as a resource for the massive data analysis that will be required. DeGreef was recently at Amazon for a workshop on next-generation sequencing and data analysis in the cloud using Hadoop, and he says much of the project’s effort will focus on this over the next few months.

“The semantic web is awesome, but it does lead to needing a lot of computational power to do the analysis,” DeGreef says. The public cloud made more sense as the supplier of that power than situating data center resources in one location, given the number of research universities involved and the external funding by the National Cancer Institute. “It would take IT too long to provision and this way there’s no politics,” he says. “Within a day we had this system all provisioned and also you can scale up computing on data analysis as you need it — and we’ll definitely need it.”
