Performance of artificial intelligence chatbots in interpreting clinical images of pressure injuries.

Journal: Wound repair and regeneration : official publication of the Wound Healing Society [and] the European Tissue Repair Society
Published Date:

Abstract

This cross-sectional study evaluated the accuracy of five leading publicly available AI chatbots in staging pressure injuries from clinical images according to the National Pressure Injury Advisory Panel (NPIAP) staging system. Three chatbots were unable to interpret the clinical images. Of the remaining two, GPT-4 Turbo achieved a high staging accuracy (83.0%), significantly outperforming BingAI Creative mode (24.0%; p < 0.001). GPT-4 Turbo accurately identified Stage 1 (p < 0.001), Stage 3 (p = 0.001), and Stage 4 (p < 0.001) pressure injuries and suspected deep tissue injuries (p < 0.001), while BingAI demonstrated significantly lower accuracy across all stages. The findings highlight the potential of AI chatbots, especially GPT-4 Turbo, to accurately interpret clinical images and aid the subsequent management of pressure injuries.

Authors

  • Makoto Shiraishi
    Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, Tokyo, Japan.
  • Koji Kanayama
    Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, Tokyo, Japan.
  • Daichi Kurita
    Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, Tokyo, Japan.
  • Yuta Moriwaki
    Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, Tokyo, Japan.
  • Mutsumi Okazaki
    Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, Tokyo, Japan.