Comparison of the Accuracy, Completeness, Reproducibility, and Consistency of Different AI Chatbots in Providing Nutritional Advice: An Exploratory Study.

Journal: Journal of Clinical Medicine

Abstract

The use of artificial intelligence (AI) chatbots to obtain healthcare advice has increased greatly in the general population. This study assessed the performance of general-purpose AI chatbots in giving nutritional advice for patients with obesity, with or without multiple comorbidities.

Two cases were submitted to 10 different AI chatbots on three consecutive days: a 35-year-old male with obesity and no comorbidities (Case 1), and a 65-year-old female with obesity, type 2 diabetes mellitus, sarcopenia, and chronic kidney disease (Case 2). Three registered dietitians evaluated the accuracy (the ability to provide advice aligned with guidelines), completeness, and reproducibility (replicability of the information over the three days) of the chatbots' responses. Nutritional consistency was evaluated by comparing the nutrient content provided by the chatbots with values calculated by the dietitians.

Case 1: ChatGPT 3.5 demonstrated the highest accuracy rate (67.2%) and Copilot the lowest (21.1%). ChatGPT 3.5 and ChatGPT 4.0 achieved the highest completeness (both 87.3%), whereas Gemini and Copilot recorded the lowest scores (55.6% and 42.9%, respectively). Reproducibility was highest for Chatsonic (86.1%) and lowest for ChatGPT 4.0 (50%) and ChatGPT 3.5 (52.8%). Case 2: Overall accuracy was low, with no chatbot achieving 50% accuracy. Completeness was highest for ChatGPT 4.0 and Claude (both 77.8%) and lowest for Copilot (23.3%). ChatGPT 4.0 and Pi Ai showed the lowest reproducibility. The major inconsistencies concerned the amount of protein recommended: most chatbots simultaneously suggested both reducing and increasing protein intake.

General-purpose AI chatbots exhibited limited accuracy, reproducibility, and consistency in giving dietary advice in complex clinical scenarios and cannot replace the work of an expert dietitian.

Authors

  • Valentina Ponzo
    Department of Medical Science, University of Turin, 10126 Torino, Italy.
  • Rosalba Rosato
    Department of Psychology, University of Turin, 10124 Torino, Italy.
  • Maria Carmine Scigliano
    Dietetic and Clinical Nutrition Unit, Città della Salute e della Scienza Hospital, 10126 Torino, Italy.
  • Martina Onida
    Department of Medical Science, University of Turin, 10126 Torino, Italy.
  • Simona Cossai
    Dietetic and Clinical Nutrition Unit, Città della Salute e della Scienza Hospital, 10126 Torino, Italy.
  • Morena De Vecchi
    Dietetic and Clinical Nutrition Unit, Città della Salute e della Scienza Hospital, 10126 Torino, Italy.
  • Andrea Devecchi
    Department of Medical Science, University of Turin, 10126 Torino, Italy.
  • Ilaria Goitre
    Department of Medical Science, University of Turin, 10126 Torino, Italy.
  • Enrica Favaro
    Department of Medical Science, University of Turin, 10126 Torino, Italy.
  • Fabio Dario Merlo
    Dietetic and Clinical Nutrition Unit, Città della Salute e della Scienza Hospital, 10126 Torino, Italy.
  • Domenico Sergi
    Department of Translational Medicine, University of Ferrara, 44121 Ferrara, Italy.
  • Simona Bo
    Department of Medical Science, University of Turin, 10126 Torino, Italy.
