Sparse Linear Models Explain Phenotypic Variation and Predict Risk of Complex Disease
This video was recorded at NIPS Workshops, Sierra Nevada 2011.

A central goal of medical genetics is to create models that accurately predict complex disease from genotype. To maximize predictive value and identify causal single-nucleotide polymorphisms (SNPs), all SNPs should be modeled simultaneously. Lasso-penalized models have proven to be a useful class of such models, both for detecting causal SNPs and for modeling disease risk. Here, we present a comprehensive analysis of real case/control data using lasso-penalized models. Our models accurately discriminated cases from controls in celiac disease and type 1 diabetes and replicated strongly across independent datasets, with validation AUC of 0.84 for type 1 diabetes and 0.82–0.9 for celiac disease, the latter across four independent datasets of different European ethnicities. The models also explained substantial phenotypic variance in independent validation: 22% for type 1 diabetes and 21–38% for celiac disease. This study shows that supervised learning approaches can address missing phenotypic variance and reliably predict incidence of celiac disease and type 1 diabetes from genotype.
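To make the idea of modeling all SNPs simultaneously under a lasso penalty concrete, here is a minimal sketch in pure Python. It is not the authors' pipeline: the genotypes are synthetic, the optimizer (proximal gradient descent with soft-thresholding) is one common way to fit an L1-penalized logistic regression and is chosen here only for brevity, and every name and parameter value (`simulate_genotypes`, `fit_lasso_logistic`, `lam`, the effect sizes, the allele frequency) is a hypothetical chosen for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrinks toward zero, and sets
    # small coefficients exactly to zero (this is what makes the model sparse).
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def simulate_genotypes(n_samples, n_snps, causal_effects, rng, maf=0.3):
    # SYNTHETIC data (an assumption, not real case/control data): each SNP is
    # coded 0/1/2 minor-allele copies; only SNPs in causal_effects affect risk.
    X, y = [], []
    for _ in range(n_samples):
        row = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n_snps)]
        logit = sum(beta * (row[j] - 2 * maf) for j, beta in causal_effects.items())
        X.append(row)
        y.append(1 if rng.random() < sigmoid(logit) else 0)
    return X, y


def fit_lasso_logistic(X, y, lam, step=1.0, n_iter=200):
    # Minimizes mean logistic loss + lam * sum(|w_j|) by proximal gradient
    # descent. The intercept b is not penalized. Columns are mean-centered.
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(Xc, y):
            resid = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - label
            grad_b += resid
            for j in range(p):
                grad_w[j] += resid * row[j]
        b -= step * grad_b / n
        w = [soft_threshold(w[j] - step * grad_w[j] / n, step * lam) for j in range(p)]
    return w, b, means


def predict_scores(X, w, b, means):
    # Linear risk score for each individual, using the training-set means.
    return [b + sum(wj * (xj - mj) for wj, xj, mj in zip(w, row, means)) for row in X]


def auc(scores, labels):
    # Area under the ROC curve: the fraction of (case, control) pairs in which
    # the case receives the higher score; ties count as one half.
    cases = [s for s, l in zip(scores, labels) if l == 1]
    controls = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return wins / (len(cases) * len(controls))


rng = random.Random(0)
causal = {0: 1.0, 1: 1.0, 2: 1.0}  # hypothetical: 3 causal SNPs out of 20
X_train, y_train = simulate_genotypes(300, 20, causal, rng)
X_val, y_val = simulate_genotypes(300, 20, causal, rng)  # independent validation set

weights, intercept, col_means = fit_lasso_logistic(X_train, y_train, lam=0.05)
val_auc = auc(predict_scores(X_val, weights, intercept, col_means), y_val)
n_nonzero = sum(1 for wj in weights if wj != 0.0)
print(f"non-zero SNP weights: {n_nonzero} of {len(weights)}, validation AUC: {val_auc:.2f}")
```

The design mirrors the evaluation described in the abstract only in outline: the model is fit on one dataset and its AUC is measured on a separately generated one. Real genome-wide data would call for an optimized solver and a penalty strength chosen by cross-validation rather than the fixed `lam` used here.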