Evaluating the performance of large language models in haematopoietic stem cell transplantation decision-making.

Journal: British journal of haematology

Published Date: Dec 9, 2023

Abstract

In a first-of-its-kind study, we assessed the capabilities of large language models (LLMs) in making complex decisions in haematopoietic stem cell transplantation. We evaluated not only Generative Pre-trained Transformer 4 (GPT-4) but also two other artificial intelligence models, PaLM 2 and Llama-2. Using detailed haematological histories that included clinical, molecular and donor data, we conducted a triple-blind survey to compare LLMs with haematology residents. Residents significantly outperformed LLMs (p = 0.02), particularly in transplant eligibility assessment (p = 0.01). Our triple-blind methodology aimed to mitigate potential biases in evaluating LLMs and revealed both their promise and their limitations in deciphering complex haematological clinical scenarios.

Authors

Ivan Civettini

Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
Arianna Zappaterra

Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
Bianca Maria Granelli

Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
Giovanni Rindone

Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
Andrea Aroldi

Department of Haematology and Bone Marrow Transplantation Unit, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.
Stefano Bonfanti

Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
Federica Colombo

Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
Marilena Fedele

Department of Haematology and Bone Marrow Transplantation Unit, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.
Giovanni Grillo

Department of Haematology and Bone Marrow Transplantation Unit, ASST Grande Ospedale Metropolitano Niguarda, Milan, Italy.
Matteo Parma

Department of Haematology and Bone Marrow Transplantation Unit, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.
Paola Perfetti

Department of Haematology and Bone Marrow Transplantation Unit, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.
Elisabetta Terruzzi

Department of Haematology and Bone Marrow Transplantation Unit, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.
Carlo Gambacorti-Passerini

Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
Daniele Ramazzotti

Department of Computer Science, Stanford University, Stanford, CA, USA.
Fabrizio Cavalca

Department of Haematology and Bone Marrow Transplantation Unit, Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.

Keywords

Artificial Intelligence; Hematopoietic Stem Cell Transplantation; Humans; Language; Tissue Donors

External Resources

PubMed ID (PMID): 38070128
