A recent article in Medical Xpress reports, “Machine learning has been improved by Dr Thomas Wilhelm of the Institute of Food Research, which is strategically funded by the Biotechnology and Biological Sciences Research Council. Instead of developing one model from the training data, his technique involves developing hundreds of diverse models, and applying these to independent, unseen data, and seeing which models work best in their ability to predict outcomes. This avoids ‘overfitting’ of a model to a specific training data set. The new technique can be applied to many different situations, but Dr Wilhelm applied it to epigenetic data on cervical cancer.”
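
The article does not say which models or data format Dr Wilhelm used, so the following is only a minimal Python sketch of the general idea it describes: train many diverse models, score each on data it has not seen, and keep the ones that predict best. Everything in it is an assumption made for illustration, including the synthetic data, the one-feature threshold models, and the function names (`make_data`, `train_diverse_models`, `select_best`).

```python
import random


def make_data(n, n_features, rng):
    """Synthetic data (an assumption): only features 0 and 1 carry signal."""
    rows = []
    for _ in range(n):
        x = [rng.random() for _ in range(n_features)]
        signal = x[0] + x[1] > 1.0
        # Flip roughly 10% of labels to simulate measurement noise.
        y = int(signal) ^ int(rng.random() < 0.1)
        rows.append((x, y))
    return rows


def predict(model, x):
    """A model is (feature index, threshold, label predicted above threshold)."""
    feature, threshold, polarity = model
    return polarity if x[feature] > threshold else 1 - polarity


def accuracy(model, rows):
    return sum(predict(model, x) == y for x, y in rows) / len(rows)


def fit_stump(rows, feature):
    """Pick the threshold and polarity with the best accuracy on `rows`."""
    candidates = [(feature, t / 10, p) for t in range(1, 10) for p in (0, 1)]
    return max(candidates, key=lambda m: accuracy(m, rows))


def train_diverse_models(train, n_models, n_features, rng):
    """Diversity comes from a bootstrap sample and a random feature per model."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(train) for _ in train]
        feature = rng.randrange(n_features)
        models.append(fit_stump(sample, feature))
    return models


def select_best(models, validation, keep):
    """Rank models by accuracy on unseen validation data and keep the top few."""
    ranked = sorted(models, key=lambda m: accuracy(m, validation), reverse=True)
    return ranked[:keep]


rng = random.Random(0)
data = make_data(600, 20, rng)
train, validation, test = data[:300], data[300:450], data[450:]

models = train_diverse_models(train, n_models=200, n_features=20, rng=rng)
best = select_best(models, validation, keep=10)

# The test split was used for neither fitting nor selection.
for model in best[:3]:
    print(model, round(accuracy(model, test), 3))
```

The sketch holds back a third split because choosing models on the validation data makes their validation scores optimistic; the final check uses rows that played no part in fitting or selection.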

The article continues, “Millions of women are screened for cervical cancer in the UK each year, with over 3,000 being diagnosed with the condition. Worldwide it causes 400,000 deaths annually. The human papilloma virus (HPV) is the major cause of cervical cancer, but not every woman infected with it goes on to develop cancer. Epigenetics, and in particular DNA methylation markers, are associated with the onset of cancer.”

It goes on, “Dr Wilhelm used publicly-available case-control data of DNA methylation in women at various stages of developing cervical cancer to look for patterns that indicated a predisposition to the condition. ‘We saw clear patterns of DNA methylation markers that predicted the development of cervical cancer,’ said Dr Wilhelm. ‘Intriguingly, the patterns still predict development of the condition even in women who hadn’t been infected with HPV.’ The new method significantly outperforms previous attempts to analyse patterns in the data. And there is potential for improvement through using larger data sets and looking at more epigenetic markers from the human genome.”

Read more here.

Image: Courtesy Flickr/ NASA’s Marshall Space Flight Center