Development of a Bariatric Surgery Specific Artificial Intelligence Large Language Model: BariatricSurgeryGPT.

Journal: Surgical Innovation

Abstract

Background: Commercially available large language models (LLMs) have demonstrated impressive capabilities in processing vast datasets and generating coherent narratives. However, their lack of domain-specific knowledge limits their reliability in clinical applications. This study aimed to develop and evaluate BariatricSurgeryGPT, a fine-tuned LLM specifically tailored for bariatric surgery to provide more accurate and clinically relevant responses to bariatric surgery-related questions.

Methods: We obtained 8764 bariatric surgery research abstracts published between January 1, 2020, and January 1, 2024, from PubMed. These abstracts were preprocessed and tokenized to fine-tune a pre-trained GPT-2 model using PyTorch and HuggingFace frameworks. The model's performance was evaluated using BLEU, METEOR, and ROUGE-1 scores on 20 clinically relevant bariatric surgery questions, each tested across nine temperature settings (0.1-0.9) for both the fine-tuned and baseline GPT-2 models, yielding 360 total evaluation instances.

Results: BariatricSurgeryGPT demonstrated consistent improvements over the baseline GPT-2 model across all metrics. The fine-tuned model achieved a BLEU score of 0.165 (vs 0.147 for baseline, 12.8% improvement), a METEOR score of 0.633 (vs 0.585, 8.2% improvement), and a ROUGE-1 score of 0.267 (vs 0.243, 9.7% improvement). These improvements indicate enhanced precision, recall, and semantic relevance in generating bariatric surgery-specific content.

Conclusion: BariatricSurgeryGPT represents the first domain-specific LLM for bariatric surgery and demonstrates the feasibility of developing specialty-specific AI tools with improved accuracy for clinical applications. The specialty-specific models could enhance surgical education through interactive learning tools, improve patient communication via personalized educational materials, and support clinical decision-making by providing evidence-based information synthesis.
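
The evaluation design described in the Methods can be illustrated with a minimal Python sketch. It enumerates the 20 questions x 9 temperatures x 2 models grid and computes a simple unigram-overlap ROUGE-1 score. The model labels, function names, whitespace tokenization, and the choice to report precision, recall, and F1 are all assumptions made for illustration; the abstract does not state which ROUGE-1 variant, tokenizer, or reference answers the authors used, and this is not their implementation.

```python
from collections import Counter
from itertools import product

# Nine temperature settings, 0.1 through 0.9, as stated in the abstract.
TEMPERATURES = [k / 10 for k in range(1, 10)]
# Hypothetical labels for the two models being compared.
MODELS = ["fine_tuned_gpt2", "baseline_gpt2"]
N_QUESTIONS = 20


def evaluation_grid(n_questions=N_QUESTIONS):
    """Enumerate every (question index, temperature, model) combination."""
    return list(product(range(n_questions), TEMPERATURES, MODELS))


def rouge1(candidate, reference):
    """Return unigram-overlap (precision, recall, F1).

    Assumes lowercased whitespace tokenization, which is a simplification
    compared with library implementations of ROUGE.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipped matches).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    print(len(evaluation_grid()))  # number of evaluation instances
    print(rouge1("sleeve gastrectomy reduces weight",
                 "sleeve gastrectomy reduces body weight"))
```

In a full pipeline, each grid entry would be paired with a generated answer and a reference answer, and the per-instance scores would then be averaged per model. BLEU and METEOR would typically come from an established library rather than hand-written code.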

Authors

  • Berk B Ozmen
    Department of Plastic Surgery, Cleveland Clinic, Cleveland, OH, USA.
  • Ibrahim Berber
    Department of Computer and Data Sciences, Case School of Engineering, Case Western Reserve University, Cleveland, OH, USA.
  • Jerry T Dang
    Digestive Disease Institute, Cleveland Clinic, Cleveland, Ohio.
  • Graham S Schwarz
    Department of Plastic Surgery, Cleveland Clinic, Cleveland, OH, USA.
  • Matthew Kroh
    Digestive Disease Institute, Cleveland Clinic Abu Dhabi, Abu Dhabi, United Arab Emirates.
