Data imbalance in drug response prediction: multi-objective optimization approach in deep learning setting.

Journal: Briefings in Bioinformatics
Published Date:

Abstract

Drug response prediction (DRP) methods tackle the complex task of associating the effectiveness of small molecules with the specific genetic makeup of the patient. Anti-cancer DRP is a particularly challenging task that requires costly experiments, as the underlying pathogenic mechanisms are broad and associated with multiple genomic pathways. The scientific community has invested significant effort in generating public drug screening datasets, paving the way for various machine learning models that attempt to reason over the complex data space of small compounds and biological characteristics of tumors. However, the data depth is still lacking compared to application domains such as computer vision or natural language processing, limiting current learning capabilities. To combat this issue and improve the generalizability of DRP models, we explore strategies that explicitly address the imbalance in DRP datasets. We reframe the problem as a multi-objective optimization across multiple drugs to maximize deep learning model performance. We implement this approach by constructing the Multi-Objective Optimization Regularized by Loss Entropy loss function and plugging it into a deep learning model. We demonstrate the utility of the proposed drug discovery methods and suggest further potential applications of the work to achieve desirable outcomes in the healthcare field.
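
The abstract does not state the formula of the loss, so the following is only an illustrative sketch of one way a per-drug, entropy-regularized objective might look, not the authors' implementation. It assumes that each drug contributes its own mean squared error as a separate objective, that the per-drug losses are normalized into a distribution, and that the entropy of that distribution is rewarded so that no single drug dominates training. The function names and the weight `lam` are hypothetical.

```python
import math
from collections import defaultdict


def per_drug_losses(drug_ids, y_true, y_pred):
    """Mean squared error computed separately for each drug (one objective per drug)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for drug, target, pred in zip(drug_ids, y_true, y_pred):
        sums[drug] += (target - pred) ** 2
        counts[drug] += 1
    return {drug: sums[drug] / counts[drug] for drug in sums}


def loss_entropy(losses, eps=1e-12):
    """Shannon entropy of the per-drug losses after normalizing them to sum to 1."""
    total = sum(losses) + eps
    probs = [loss / total for loss in losses]
    return -sum(p * math.log(p + eps) for p in probs)


def entropy_regularized_loss(drug_ids, y_true, y_pred, lam=0.1):
    """Average per-drug loss minus lam times the loss entropy.

    Assumption (not from the source): higher entropy means the error is spread
    evenly across drugs, so it is rewarded by subtracting it from the objective.
    """
    losses = list(per_drug_losses(drug_ids, y_true, y_pred).values())
    mean_loss = sum(losses) / len(losses)
    return mean_loss - lam * loss_entropy(losses)
```

Under these assumptions, two sets of predictions with the same average per-drug error are scored differently: the set whose error is spread evenly across drugs receives a lower objective value than the set whose error is concentrated in one drug. In a deep learning setting the same quantities would be computed on tensors so that gradients flow through both terms.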

Authors

  • Oleksandr Narykov
    Division of Data Science and Learning, Argonne National Laboratory, Lemont, Illinois.
  • Yitan Zhu
    Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL, 60439, USA. yitan.zhu@anl.gov.
  • Thomas Brettin
    Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL, USA.
  • Yvonne A Evrard
    Developmental Therapeutics Branch, National Cancer Institute, Frederick, MD, USA.
  • Alexander Partin
    Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL, 60439, USA.
  • Fangfang Xia
    Computing, Environment, and Life Sciences Directorate, Argonne National Laboratory, Argonne, Illinois, USA.
  • Maulik Shukla
    Computing, Environment, and Life Sciences Directorate, Argonne National Laboratory, Argonne, Illinois, USA.
  • Priyanka Vasanthakumari
    Division of Data Science and Learning, Argonne National Laboratory, Lemont, Illinois.
  • James H Doroshow
    Developmental Therapeutics Branch, National Cancer Institute, Frederick, MD, USA.
  • Rick L Stevens
    Computer Science Department and Computation Institute, University of Chicago, Chicago, Illinois, USA.