Robust double machine learning model with application to omics data.

Journal: BMC Bioinformatics
Published Date:

Abstract

BACKGROUND: Recently, there has been growing interest in combining causal inference with machine learning algorithms. The double machine learning (DML) model, one implementation of this combination, has received widespread attention for its ability to estimate causal effects in high-dimensional, complex data. However, the DML model is sensitive to outliers and heavy-tailed noise in the outcome variable. In this paper, we propose the robust double machine learning (RDML) model to achieve robust estimation of causal effects when the distribution of the outcome is contaminated by outliers or exhibits symmetric heavy tails.
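
To make the sensitivity concrete, the following is a minimal sketch of a partialling-out DML estimate in a partially linear model, Y = theta*D + g(X) + noise, with the final-stage squared loss swapped for a Huber loss. It is not the authors' RDML procedure, which the abstract does not specify. The linear nuisance models, the Huber loss, the tuning constant c = 1.345, the simulated data, and the helper names cross_fit_residuals and huber_slope are all assumptions made for illustration.

```python
import numpy as np

def cross_fit_residuals(X, y, n_folds=2, seed=0):
    """Out-of-fold residuals y - E[y | X], using a linear nuisance model (assumption)."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    Xb = np.column_stack([np.ones(n), X])  # add an intercept column
    resid = np.empty(n)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        beta, *_ = np.linalg.lstsq(Xb[train_idx], y[train_idx], rcond=None)
        resid[test_idx] = y[test_idx] - Xb[test_idx] @ beta
    return resid

def huber_slope(d_res, y_res, c=1.345, n_iter=50):
    """Slope of y_res on d_res under Huber loss, fitted by iteratively reweighted least squares."""
    theta = np.sum(d_res * y_res) / np.sum(d_res ** 2)  # least-squares starting value
    for _ in range(n_iter):
        r = y_res - theta * d_res
        scale = np.median(np.abs(r - np.median(r))) / 0.6745  # MAD-based scale
        u = np.abs(r) / max(scale, 1e-12)
        w = np.minimum(1.0, c / np.maximum(u, 1e-12))  # Huber weights: 1 inside, c/u outside
        theta = np.sum(w * d_res * y_res) / np.sum(w * d_res ** 2)
    return theta

# Simulated data (hypothetical): true effect 1.0, symmetric heavy-tailed outcome noise.
rng = np.random.default_rng(42)
n, p, theta_true = 2000, 5, 1.0
X = rng.normal(size=(n, p))
D = X @ np.full(p, 0.5) + rng.normal(size=n)
Y = theta_true * D + X @ np.ones(p) + rng.standard_t(df=2, size=n)

# Same default seed, so treatment and outcome residuals share the same folds.
d_res = cross_fit_residuals(X, D)
y_res = cross_fit_residuals(X, Y)

theta_ls = np.sum(d_res * y_res) / np.sum(d_res ** 2)  # standard squared-loss final stage
theta_huber = huber_slope(d_res, y_res)                # robust final stage (assumption)
print(f"squared loss: {theta_ls:.3f}, Huber loss: {theta_huber:.3f}")
```

The squared-loss final stage gives every residual equal weight, so a few extreme outcome values can dominate the estimate. The Huber weights shrink toward zero as residuals grow, which is one common way to limit that influence.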

Authors

  • Xuqing Wang
    College of Computer Science and Technology, Qingdao University, Qingdao, Shandong Province, China.
  • Yahang Liu
    Department of Biostatistics, Key Laboratory of Public Health Safety of Ministry of Education, Key Laboratory for Health Technology Assessment, National Commission of Health, School of Public Health, Fudan University, Shanghai, China.
  • Guoyou Qin
    Department of Biostatistics, Key Laboratory for Health Technology Assessment, National Commission of Health, Key Laboratory of Public Health Safety of Ministry of Education, School of Public Health, Fudan University, Shanghai, China. gyqin@fudan.edu.cn.
  • Yongfu Yu
    Department of Biostatistics, Key Laboratory for Health Technology Assessment, National Commission of Health, Key Laboratory of Public Health Safety of Ministry of Education, School of Public Health, Fudan University, Shanghai, China. yu@fudan.edu.cn.