Monday, February 17, 2014

Empirical prediction of genomic susceptibilities for multiple cancer classes || PNAS | Mobile


Significance

It is widely assumed that human genomic variations are associated with an individual’s susceptibility to complex diseases such as cancers and autoimmune diseases. However, extensive genome-wide association studies have thus far had limited success because their results have too little predictive value to be of practical use to individuals. We present a prediction process in which two machine-learning analysis methods are applied to two different descriptors of each individual’s common genomic variations to predict that individual’s susceptibility to eight major cancer traits. The accuracy of the prediction ranges from 33% to 57% depending on cancer type, which is significantly better than the 11% expected from a random prediction, and it comes with probability estimates that may be useful for making practical health decisions for individuals or for a population.

Abstract

An empirical approach is presented for predicting the trait to which an individual is most genomically susceptible among nine traits: eight major cancer classes plus a healthy trait. We use four prediction methods, obtained by applying two supervised learning algorithms to two different descriptors of common genomic variations (the profiles of genotypes of SNPs and SNP syntaxes with low P values or low frequencies) of each individual genome from normal cells. All four methods made correct predictions substantially more often than random prediction for most cancer classes, but not for some others. A combination of the four results using Bayesian inference predicted better overall than any individual method. The multiclass accuracy of the combined prediction ranges from 33% to 56% depending on the cancer classes of the testing sets, compared with 11% for a random prediction among nine traits. Despite the limited SNP data currently available and the absence of rare SNPs in public databases, the results suggest that this framework, or an improved version of it, can predict cancer susceptibility with probability estimates useful for making health decisions for individuals or for a population.
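To make the idea of combining several classifiers by Bayesian inference concrete, here is a minimal sketch. It is not the authors' procedure, which the abstract does not spell out. It assumes the four methods are conditionally independent given the true trait and that each one outputs a probability for every trait; the function name `combine_posteriors` and the example numbers are hypothetical. It also shows where the 11% random baseline comes from (one chance in nine).

```python
from math import prod

TRAITS = 9  # eight cancer classes plus a healthy trait, as in the abstract

# Chance of a correct guess when picking one of nine traits at random.
RANDOM_BASELINE = 1 / TRAITS  # about 0.11, the 11% quoted in the text


def combine_posteriors(method_probs, prior=None):
    """Fuse per-method class probabilities under a naive independence assumption.

    method_probs: list of probability vectors, one per method (hypothetical input).
    prior: class prior shared by all methods; uniform if omitted.
    Returns the renormalized combined probability vector.
    """
    n = len(method_probs[0])
    if prior is None:
        prior = [1.0 / n] * n
    # posterior(c) is proportional to prior(c) * product over methods of p_m(c) / prior(c)
    unnorm = [
        prior[c] * prod(p[c] / prior[c] for p in method_probs)
        for c in range(n)
    ]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Illustrative input: four methods, each mildly favoring trait 0.
example = [[0.2] + [0.1] * 8 for _ in range(4)]
combined = combine_posteriors(example)
```

In this toy input each method gives trait 0 only a 20% probability, yet the combined estimate for trait 0 is considerably higher, because agreement among independent methods reinforces the shared signal. Real classifiers trained on the same genomes are rarely independent, so a naive product like this one tends to be overconfident.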

Footnotes

  • ¹Present address: Department of Computer Science, Genome Center, University of California, Davis, CA 95616.
  • ²To whom correspondence should be addressed. E-mail: sunghou@berkeley.edu.
  • Author contributions: M.K. and S.-H.K. designed research; M.K. and S.-H.K. performed research; M.K. contributed new reagents/analytic tools; M.K. and S.-H.K. analyzed data; and M.K. and S.-H.K. wrote the paper.
  • The authors declare no conflict of interest.
  • This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1318383110/-/DCSupplemental.
Freely available online through the PNAS open access option.
