Learning Predictive Interactions Using Information Gain and Bayesian Network Scoring.

Journal: PloS one

Published Date: Dec 1, 2015

Abstract

BACKGROUND: The problems of correlation and classification are long-standing in statistics and machine learning, and techniques have been developed to address them. We are now in the era of high-dimensional data, in which a dataset can involve billions of variables. Such data present new challenges. In particular, it is difficult to discover predictive variables when each variable has little marginal effect. An example is genome-wide association study (GWAS) datasets, which involve millions of single nucleotide polymorphisms (SNPs), some of which interact epistatically to affect disease status. Researchers have developed techniques to identify these interacting SNPs. However, the problem is more general, so these techniques are applicable to other problems concerning interactions. A difficulty with many of these techniques is that they do not distinguish whether a learned interaction is a true interaction or merely involves several variables with strong marginal effects.
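The distinction drawn in the background can be illustrated with a small sketch. This is not the scoring method of the paper, only a toy example on hypothetical data: it computes the standard information gain, H(target) - H(target | features), for single variables and for a pair. The function names (`entropy`, `information_gain`) and both toy targets are assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature_idx, target):
    """IG = H(target) - H(target | features).

    feature_idx is a tuple of column indices treated as one joint variable.
    """
    groups = {}
    for row, y in zip(rows, target):
        key = tuple(row[i] for i in feature_idx)
        groups.setdefault(key, []).append(y)
    conditional = sum(len(g) / len(target) * entropy(g) for g in groups.values())
    return entropy(target) - conditional


# Hypothetical toy data: two binary variables, every combination equally frequent.
rows = list(product([0, 1], repeat=2)) * 25

# Case 1: the target is the XOR of the two variables (a pure interaction).
xor_target = [a ^ b for a, b in rows]
ig_a = information_gain(rows, (0,), xor_target)     # marginal gain of variable 0
ig_b = information_gain(rows, (1,), xor_target)     # marginal gain of variable 1
ig_ab = information_gain(rows, (0, 1), xor_target)  # joint gain of the pair

# Case 2: the target simply copies variable 0 (a strong marginal effect, no interaction).
copy_target = [a for a, _ in rows]
ig_a_copy = information_gain(rows, (0,), copy_target)
ig_ab_copy = information_gain(rows, (0, 1), copy_target)

print(ig_a, ig_b, ig_ab)
print(ig_a_copy, ig_ab_copy)
```

In the first case neither variable is informative alone, yet the pair determines the target. In the second case the pair also scores highly, but only because one variable already has a strong marginal effect; a joint score by itself does not separate the two situations.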

Authors

Xia Jiang

Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15213, United States of America.

Jeremy Jao

Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15213, United States of America.

Richard Neapolitan

Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, United States of America.

Keywords

Algorithms; Bayes Theorem; Computational Biology; Genome-Wide Association Study; Machine Learning; Polymorphism, Single Nucleotide

External Resources

PubMed ID: 26624895
