Evaluating the performance of large language models in haematopoietic stem cell transplantation decision-making.

Journal: British Journal of Haematology
Published Date:

Abstract

In a first-of-its-kind study, we assessed the capabilities of large language models (LLMs) in making complex decisions in haematopoietic stem cell transplantation. We evaluated not only Generative Pre-trained Transformer 4 (GPT-4) but also other artificial intelligence models, namely PaLM 2 and Llama-2. Using detailed haematological histories that included clinical, molecular and donor data, we conducted a triple-blind survey to compare LLMs with haematology residents. We found that residents significantly outperformed LLMs (p = 0.02), particularly in transplant eligibility assessment (p = 0.01). Our triple-blind methodology aimed to mitigate potential biases in evaluating LLMs and revealed both their promise and limitations in deciphering complex haematological clinical scenarios.

Authors

  • Ivan Civettini
    Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
  • Arianna Zappaterra
    Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
  • Bianca Maria Granelli
    Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
  • Giovanni Rindone
    Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
  • Andrea Aroldi
    Department of Haematology and Bone Marrow Transplantation Unit, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.
  • Stefano Bonfanti
    Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
  • Federica Colombo
    Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
  • Marilena Fedele
    Department of Haematology and Bone Marrow Transplantation Unit, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.
  • Giovanni Grillo
    Department of Haematology and Bone Marrow Transplantation Unit, ASST Grande Ospedale Metropolitano Niguarda, Milan, Italy.
  • Matteo Parma
    Department of Haematology and Bone Marrow Transplantation Unit, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.
  • Paola Perfetti
    Department of Haematology and Bone Marrow Transplantation Unit, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.
  • Elisabetta Terruzzi
    Department of Haematology and Bone Marrow Transplantation Unit, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.
  • Carlo Gambacorti-Passerini
    Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
  • Daniele Ramazzotti
    Department of Computer Science, Stanford University, Stanford, CA, USA.
  • Fabrizio Cavalca
    Department of Haematology and Bone Marrow Transplantation Unit, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.