Multi-omics disease module detection with an explainable Greedy Decision Forest.

Journal: Scientific Reports
Published Date:

Abstract

Machine learning methods can detect complex relationships between variables, but they usually do not exploit domain knowledge. This is a limitation because in many scientific disciplines, such as systems biology, domain knowledge is available in the form of graphs or networks, and its use can improve model performance. Network-based algorithms that are versatile and applicable across many research areas are therefore needed. In this work, we demonstrate subnetwork detection based on multi-modal node features using a novel Greedy Decision Forest (GDF) with inherent interpretability. This interpretability will be a crucial factor in keeping domain experts involved and gaining their trust in such algorithms. To demonstrate a concrete application, we focus on bioinformatics, systems biology and particularly biomedicine, but the presented methodology is applicable in many other domains as well. Systems biology is a good example of a field in which statistical, data-driven machine learning enables the analysis of large amounts of multi-modal biomedical data. This is important for reaching the goal of precision medicine, where the complexity of patients is modeled at a system level to best tailor medical decisions, health practices and therapies to the individual patient. Our proposed explainable approach can help to uncover disease-causing network modules from multi-omics data and so improve the understanding of complex diseases such as cancer.

Authors

  • Bastian Pfeifer
    Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria.
  • Hubert Baniecki
    MI2DataLab, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland.
  • Anna Saranti
    Human-Centered AI Lab, Institute of Forest Engineering, Department of Forest and Soil Sciences, University of Natural Resources and Life Sciences Vienna, 1190 Wien, Austria.
  • Przemysław Biecek
    Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland.
  • Andreas Holzinger
    Human-Centered AI Lab, Medical University of Graz, Graz, Austria.