Developing deep learning-based strategies to predict the risk of hepatocellular carcinoma among patients with nonalcoholic fatty liver disease from electronic health records.

Journal: Journal of biomedical informatics
Published Date:

Abstract

OBJECTIVE: The accuracy of deep learning models for many disease prediction problems is affected by time-varying covariates, rare incidence, covariate imbalance and delayed diagnosis when using structured electronic health record data. The situation is further exacerbated when predicting the risk of one disease conditional on another, such as the risk of hepatocellular carcinoma among patients with nonalcoholic fatty liver disease, because of slow, chronic progression, the scarcity of data covering both disease conditions and the sex bias of the diseases. The goal of this study is to investigate the extent to which these issues influence deep learning performance, and to devise strategies to tackle these challenges. These strategies were applied to improve hepatocellular carcinoma risk prediction among patients with nonalcoholic fatty liver disease.

Authors

  • Zhao Li
    Research Center for Data Hub and Security, Zhejiang Lab, Hangzhou, China. lzjoey@gmail.com.
  • Lan Lan
  • Yujia Zhou
    Department of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, CT 06510, United States.
  • Ruoxing Li
    McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, TX 77030, USA.
  • Kenneth D Chavin
    Department of Surgery, Case Western Reserve University School of Medicine, 11100 Euclid Ave, Cleveland, OH 44106, USA.
  • Hua Xu
    Department of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
  • Liang Li
    School of Psychological and Cognitive Sciences, Peking University, Beijing, 100871, China.
  • David J H Shih
    School of Biomedical Sciences, The University of Hong Kong, Hong Kong Special Administrative Region.
  • W Jim Zheng
    Center for Computational Biomedicine, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA, Department of Public Health Science, Medical University of South Carolina, 135 Cannon Street, Suite 303, Charleston, SC 29425, USA and Department of Investigational Cancer Therapeutics, Institute for Personalized Cancer Therapy, UT-MD Anderson Cancer Center, 1400 Holcombe Blvd., FC8.3044, Houston, TX 77030, USA wenjin.j.zheng@uth.tmc.edu.