A machine learning-based framework to identify type 2 diabetes through electronic health records.

Journal: International Journal of Medical Informatics

Published Date: Oct 1, 2016

Abstract

OBJECTIVE: Discovering diverse genotype-phenotype associations with Type 2 Diabetes Mellitus (T2DM) through genome-wide association studies (GWAS) and phenome-wide association studies (PheWAS) requires identifying more cases (T2DM subjects) and controls (subjects without T2DM), for example from Electronic Health Records (EHR). However, existing expert-based identification algorithms often suffer from a low recall rate and can miss a large number of valuable samples under conservative filtering standards. The goal of this work is to develop a semi-automated, machine learning-based framework, as a pilot study, that liberalizes the filtering criteria to improve the recall rate while keeping the false positive rate low.

Authors

Tao Zheng

Guangzhou Institute of Energy Conversion, Chinese Academy of Sciences, Guangzhou 510640, People's Republic of China; Key Laboratory of Renewable Energy, Chinese Academy of Sciences, Guangzhou 510640, People's Republic of China. Electronic address: zhengtao@ms.giec.ac.cn.

Wei Xie

Department of Electrical Engineering & Computer Science, Vanderbilt University, Nashville, TN 37232, United States of America.

Liling Xu

Tongren Hospital Shanghai Jiao Tong University, Shanghai, China.

Xiaoying He

Department of Endocrinology, the First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China.

Ya Zhang

Department of Plant Protection, College of Plant Protection, Hunan Agricultural University, Changsha, China. Electronic address: zhangya230@126.com.

Mingrong You

Division of Epidemiology, Vanderbilt University, Nashville, TN, USA.

Gong Yang

Division of Epidemiology, Vanderbilt University, Nashville, TN, USA.

You Chen

Dept. of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, TN, USA.

Keywords

Algorithms; Bayes Theorem; Diabetes Mellitus, Type 2; Electronic Health Records; Genome-Wide Association Study; Humans; Logistic Models; Machine Learning; Pilot Projects; Support Vector Machine

External Resources

PubMed ID: 27919371
