GenAI exceeds clinical experts in predicting acute kidney injury following paediatric cardiopulmonary bypass.

Journal: Scientific Reports

Abstract

The emergence of large language models (LLMs) opens new horizons for leveraging the often-unused information in clinical text. Our study aims to capitalise on this new potential. Specifically, we examine the utility of text embeddings generated by LLMs in predicting postoperative acute kidney injury (AKI) in paediatric cardiopulmonary bypass (CPB) patients using electronic health record (EHR) text, and propose methods for explaining their output. AKI can be a serious complication in paediatric CPB, and its accurate prediction can significantly improve patient outcomes by enabling timely interventions. We evaluate various text embedding algorithms such as Doc2Vec, top-performing sentence transformers on Hugging Face, and commercial LLMs from Google and OpenAI. We benchmark the cross-validated performance of these 'AI models' against a 'baseline model' as well as an established, clinically defined 'expert model'. The baseline model includes structured features, i.e., patient gender, age, height, body mass index and length of operation. The majority of AI models surpass not only the baseline model but also the expert model. An ensemble of AI and clinical-expert models improves discriminative performance by 23% compared to the baseline model. The consistency of patient clusters formed from AI-generated embeddings with clinical-expert clusters, measured via the adjusted Rand index and adjusted mutual information metrics, illustrates the medical validity of LLM embeddings. We create a reverse mapping from the numeric embedding space to the natural-language domain via the embedding-based clusters, generating medical labels for the clusters in the process. We also use text-generating LLMs to summarise the differences between AI and expert clusters. Such 'explainability' outputs can increase medical practitioners' trust in AI applications and help generate new hypotheses, e.g., by studying the association between cluster memberships and outcomes of interest.
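To make the cluster-consistency measure mentioned in the abstract concrete, the following is a minimal sketch of the adjusted Rand index computed between two clusterings of the same patients. It is not the authors' implementation, which the abstract does not describe; the function name `adjusted_rand_index` and the toy cluster labels are assumptions made purely for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have equal length")
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: clusterings agree trivially

    # Pairs of items placed together in both clusterings.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs of items placed together within each clustering separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / total_pairs  # agreement expected by chance
    max_index = 0.5 * (sum_a + sum_b)       # upper bound on agreement
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster on both sides
    return (sum_cells - expected) / (max_index - expected)


# Toy, made-up cluster assignments for eight patients (not study data).
expert_clusters = [0, 0, 0, 1, 1, 1, 2, 2]
embedding_clusters = [1, 1, 1, 0, 0, 2, 2, 2]
ari = adjusted_rand_index(expert_clusters, embedding_clusters)
print(f"ARI = {ari:.3f}")
```

A value of 1 indicates identical partitions up to relabelling, while values near 0 indicate agreement no better than chance. The adjusted mutual information metric plays a similar role using an information-theoretic measure of overlap.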

Authors

  • Mansour Sharabiani
    School of Public Health, Imperial College London, London, UK. mt5605@imperial.ac.uk.
  • Alireza Mahani
    Quantitative Research, Davidson Kempner Capital Management, New York, NY.
  • Alex Bottle
  • Yadav Srinivasan
    Great Ormond Street Hospital, London, UK.
  • Richard Issitt
    Great Ormond Street Hospital, Great Ormond Street Hospital Institute of Child Health and NIHR GOSH BRC, London, UK.
  • Serban Stoica
    Bristol Royal Hospital for Children, Bristol, UK.