REFORMS: Consensus-based Recommendations for Machine-learning-based Science.

Journal: Science Advances
PMID:

Abstract

Machine learning (ML) methods are proliferating in scientific research. However, the adoption of these methods has been accompanied by failures of validity, reproducibility, and generalizability. These failures can hinder scientific progress, lead to false consensus around invalid claims, and undermine the credibility of ML-based science. ML methods are often applied and fail in similar ways across disciplines. Motivated by this observation, our goal is to provide clear recommendations for conducting and reporting ML-based science. Drawing from an extensive review of past literature, we present the REFORMS checklist (recommendations for machine-learning-based science). It consists of 32 questions and a paired set of guidelines. REFORMS was developed on the basis of a consensus of 19 researchers across computer science, data science, mathematics, social sciences, and biomedical sciences. REFORMS can serve as a resource for researchers when designing and implementing a study, for referees when reviewing papers, and for journals when enforcing standards for transparency and reproducibility.

Authors

  • Sayash Kapoor
    Department of Computer Science, Princeton University, Princeton, NJ 08544, USA.
  • Emily M Cantrell
    Department of Sociology, Princeton University, Princeton, NJ 08544, USA.
  • Kenny Peng
    Department of Computer Science, Cornell University, Ithaca, NY 14850, USA.
  • Thanh Hien Pham
    Department of Computer Science, Princeton University, Princeton, NJ 08544, USA.
  • Christopher A Bail
    Department of Sociology, Duke University, Durham, NC 27708, USA.
  • Odd Erik Gundersen
    Department of Computer Science, Norwegian University of Science and Technology, Trondheim, Norway.
  • Jake M Hofman
    Microsoft Research, New York, NY 10012, USA.
  • Jessica Hullman
    Department of Computer Science, Northwestern University, Evanston, IL 60208, USA.
  • Michael A Lones
    Heriot-Watt University, Edinburgh, United Kingdom.
  • Momin M Malik
    Center for Digital Health, Mayo Clinic, Rochester, MN, USA.
  • Priyanka Nanayakkara
    Department of Computer Science, Northwestern University, Evanston, IL 60208, USA.
  • Russell A Poldrack
    Department of Psychology, Stanford University, Stanford, CA, USA.
  • Inioluwa Deborah Raji
    Department of Computer Science, University of California, Berkeley, Berkeley, CA 94720, USA.
  • Michael Roberts
    EPSRC Centre for Mathematical Imaging in Healthcare, University of Cambridge, Cambridge, UK.
  • Matthew J Salganik
    Center for Information Technology Policy, Princeton University, Princeton, NJ 08544, USA.
  • Marta Serra-Garcia
    Rady School of Management, University of California, San Diego, La Jolla, CA 92093, USA.
  • Brandon M Stewart
    Center for Information Technology Policy, Princeton University, Princeton, NJ 08544, USA.
  • Gilles Vandewiele
    Department of Information Technology, Ghent University, Ghent, Belgium.
  • Arvind Narayanan
    Center for Information Technology Policy, Princeton University, Princeton, NJ, USA. arvindn@cs.princeton.edu.