Applying interpretable machine learning in computational biology: pitfalls, recommendations and opportunities for new developments.

Journal: Nature Methods
Published Date:

Abstract

Recent advances in machine learning have enabled the development of next-generation predictive models for complex computational biology problems, thereby spurring the use of interpretable machine learning (IML) to unveil biological insights. However, guidelines for using IML in computational biology are generally underdeveloped. We provide an overview of IML methods and evaluation techniques and discuss common pitfalls encountered when applying IML methods to computational biology problems. We also highlight open questions, especially in the era of large language models, and call for collaboration between IML and computational biology researchers.

Authors

  • Valerie Chen
    Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
  • Muyu Yang
    Computational Biology Department, School of Computer Science, Carnegie Mellon University, United States. Electronic address: https://twitter.com/muyu_wendy_yang.
  • Wenbo Cui
    Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
  • Joon Sik Kim
    Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
  • Ameet Talwalkar
    Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. talwalkar@cmu.edu.
  • Jian Ma