DOME Registry: implementing community-wide recommendations for reporting supervised machine learning in biology.

Journal: GigaScience

Abstract

Supervised machine learning (ML) is used extensively in biology and deserves closer scrutiny. The Data Optimization Model Evaluation (DOME) recommendations aim to enhance the validation and reproducibility of ML research by establishing standards for key aspects such as data handling and processing, optimization, evaluation, and model interpretability. The recommendations help to ensure that key details are reported transparently by providing a structured set of questions. Here, we introduce the DOME registry (URL: registry.dome-ml.org), a database that allows scientists to manage and access comprehensive DOME-related information on published ML studies. The registry uses external resources such as ORCID, APICURON, and the Data Stewardship Wizard to streamline the annotation process and ensure comprehensive documentation. By assigning unique identifiers and DOME scores to publications, the registry fosters a standardized evaluation of ML methods. Future plans include continuing to grow the registry through community curation, improving the DOME score definition, and encouraging publishers to adopt DOME standards, thereby promoting transparency and reproducibility of ML in the life sciences.

Authors

  • Omar Abdelghani Attafi
    Department of Biomedical Sciences, University of Padova, Padova 35131, Italy.
  • Damiano Clementel
    Department of Biomedical Sciences, University of Padova, Padova 35131, Italy.
  • Konstantinos Kyritsis
  • Emidio Capriotti
    Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Via Francesco Selmi 3, 40126 Bologna, Italy.
  • Gavin Farrell
    ELIXIR Hub, Hinxton, Cambridge CB10 1SD, UK.
  • Styliani-Christina Fragkouli
    Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki 570 01, Greece.
  • Leyla Jael Castro
    ZB MED - Information Centre for Life Sciences, Gleueler Str. 60, Cologne, 50931, Germany.
  • András Hatos
    Department of Oncology, Geneva University Hospitals, Geneva 1205, Switzerland.
  • Tom Lenaerts
    Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan.
  • Stanislav Mazurenko
    Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Masaryk University, Brno, Czech Republic.
  • Soroush Mozaffari
    Department of Biomedical Sciences, University of Padova, Padova 35131, Italy.
  • Franco Pradelli
    Department of Biomedical Sciences, University of Padova, Padova 35131, Italy.
  • Patrick Ruch
    BiTeM Group, Information Science Department, University of Applied Sciences of Western Switzerland (HES-SO, HEG), Switzerland.
  • Castrense Savojardo
  • Paola Turina
    Department of Pharmacy and Biotechnology, University of Bologna, Bologna 40126, Italy.
  • Federico Zambelli
    Department of Biosciences, University of Milan, Milan 20133, Italy.
  • Damiano Piovesan
    Department of Biomedical Sciences, University of Padova, Padova 35131, Italy.
  • Alexander Miguel Monzon
    Department of Information Engineering, University of Padova, Padova 35131, Italy.
  • Fotis Psomopoulos
    Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thessaloniki 570 01, Greece.
  • Silvio C E Tosatto