Learning statistical models of phenotypes using noisy labeled training data.
Journal: Journal of the American Medical Informatics Association: JAMIA
PMID: 27174893
Abstract
OBJECTIVE: Traditionally, patient groups with a phenotype are selected through rule-based definitions whose creation and validation are time-consuming. Machine learning approaches to electronic phenotyping are limited by the paucity of labeled training datasets. We demonstrate the feasibility of using semi-automatically labeled training sets to create phenotype models via machine learning, drawing on a comprehensive representation of the patient medical record.