[Study of the performance of conversational agents in interpreting textual descriptions of lesions in pathology].

Journal: Annales de pathologie
Published Date:

Abstract

The applications of artificial intelligence (AI) in anatomical and cytological pathology (ACP) are growing, particularly in image analysis. Conversational agents (CAs) based on large language models may also help in analyzing ACP semiological features, but their performance in this field still needs to be evaluated. We submitted standardized, contextualized semiological descriptions of ACP lesions to ChatGPT, Copilot, Gemini, and Mistral and evaluated their responses. Across the 200 texts submitted, response quality varied among the four CAs, with up to 82% of diagnoses correct on the first attempt for ChatGPT (91% after a follow-up prompt). Serious diagnostic errors ("hallucinations") occurred in up to 9% of cases (Gemini). In addition to diagnostic suggestions, the CAs sometimes offered classification details, interpretative comments, recommendations for additional analyses, and information on the pathologies. Response heterogeneity was not related to the organ, the type or frequency of the pathology, or the origin of the submitted texts. However, for all four CAs, response quality correlated with the number of search results per pathology on Google, Google Scholar, and PubMed, used here as indirect indicators of the volume of documentation accessible online. Expert review of CA outputs in the analysis of ACP text data remains crucial, given the risk of inaccurate answers and serious hallucinations from these widely available AI tools, which are already being used by healthcare professionals and patients.

Authors

  • Thomas Coisset
    Service d'anatomie et cytologie pathologiques, CHU de Brest, 29220 Brest, France.
  • Arnaud Uguen
    CHRU Brest, Department of Pathology, Brest, 29220, France.
