Building Tools for Machine Learning and Artificial Intelligence in Cancer Research: Best Practices and a Case Study with the PathML Toolkit for Computational Pathology.

Journal: Molecular Cancer Research (MCR)
Published Date:

Abstract

Imaging datasets in cancer research are growing exponentially in both quantity and information density. These massive datasets may enable derivation of insights for cancer research and clinical care, but only if researchers are equipped with the tools to leverage advanced computational analysis approaches such as machine learning and artificial intelligence. In this work, we highlight three themes to guide development of such computational tools: scalability, standardization, and ease of use. We then apply these principles to develop PathML, a general-purpose research toolkit for computational pathology. We describe the design of the PathML framework and demonstrate applications in diverse use cases. PathML is publicly available at www.pathml.com.

Authors

  • Jacob Rosenthal
    Dana-Farber Cancer Institute, Boston, MA, USA.
  • Ryan Carelli
    Department of Pathology and Laboratory Medicine, Weill Cornell Medicine and the New York Genome Center, New York, New York.
  • Mohamed Omar
    Trauma Department, Hannover Medical School, Hanover, Germany.
  • David Brundage
    Dana-Farber Cancer Institute, Boston, Massachusetts.
  • Ella Halbert
    Dana-Farber Cancer Institute, Boston, Massachusetts.
  • Jackson Nyman
    Dana-Farber Cancer Institute, Boston, Massachusetts.
  • Surya N Hari
    Dana-Farber Cancer Institute, Boston, Massachusetts.
  • Eliezer M Van Allen
    Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Harvard University, Boston, Massachusetts.
  • Luigi Marchionni
    Weill Cornell Medicine, New York, NY, USA.
  • Renato Umeton
    Dana-Farber Cancer Institute, Boston, MA, USA.
  • Massimo Loda
    Dana-Farber Cancer Institute, Boston, MA, USA.