Multitask learning and benchmarking with clinical time series data.

Journal: Scientific Data

Published Date: Jun 17, 2019

Abstract

Health care is one of the most exciting frontiers in data mining and machine learning. Successful adoption of electronic health records (EHRs) created an explosion in digital clinical data available for analysis, but progress in machine learning for healthcare research has been difficult to measure because of the absence of publicly available benchmark data sets. To address this problem, we propose four clinical prediction benchmarks using data derived from the publicly available Medical Information Mart for Intensive Care (MIMIC-III) database. These tasks cover a range of clinical problems including modeling risk of mortality, forecasting length of stay, detecting physiologic decline, and phenotype classification. We propose strong linear and neural baselines for all four tasks and evaluate the effect of deep supervision, multitask training and data-specific architectural modifications on the performance of neural models.

Authors

Hrayr Harutyunyan
USC Information Sciences Institute, Marina del Rey, California, 90292, United States of America.

Hrant Khachatrian
YerevaNN, Yerevan, 0025, Armenia. hrant@yerevann.com.

David C Kale
University of Southern California, Los Angeles, CA; Whittier Virtual PICU, Children's Hospital Los Angeles, Los Angeles, CA.

Greg Ver Steeg
USC Information Sciences Institute, Marina del Rey, California, 90292, United States of America.

Aram Galstyan
USC Information Sciences Institute, Marina del Rey, California, 90292, United States of America.

Keywords

Benchmarking; Data Mining; Databases, Factual; Electronic Health Records; Humans; Machine Learning

External Resources

PubMed ID: 31209213
