Comparing Large Language Models as Health Literacy Tools: Evaluating and Simplifying Texts on Gender-Affirming Surgery.

Journal: Journal of Health Communication

Published Date: Aug 19, 2025

Abstract

Patient-facing materials in gender-affirming surgery are often written above the NIH-recommended eighth-grade reading level for patient education materials. In efforts to make patient resources more accessible, ChatGPT has successfully optimized linguistic content for patients seeking care in various medical fields. This study aims to evaluate and compare the ability of large language models (LLMs) to analyze readability and simplify online patient-facing resources for gender-affirming procedures. Google Incognito searches were performed on 15 terms relating to gender-affirming surgery. The first 20 text results were analyzed for reading level difficulty by an online readability calculator, Readability Scoring System v2.0 (RSS). Eight easily accessible LLMs were used to assess the texts for readability and to simplify them to an eighth-grade reading level; the simplified texts were then reevaluated by the RSS. Descriptive statistics, t-tests, and one-way ANOVA tests were used for statistical analyses. Online resources were written at a mean reading grade level of 12.66 ± 2.54. Google Gemini was most successful at simplifying texts (8.39 ± 1.49), followed by Anthropic Claude (9.53 ± 1.85) and ChatGPT 4 (10.19 ± 1.83). LLMs had a greater margin of error when assessing the readability of feminizing and facial procedures and when simplifying genital procedures (p < .017). Online texts on gender-affirming procedures are written at a reading level more challenging than is recommended for patient-facing resources. Certain LLMs were better at simplifying texts than others. Providers should use caution when using LLMs for patient education in gender-affirming care, as they are prone to variability and bias.

Authors

Victoria N Yi
Duke University School of Medicine, Durham, North Carolina, USA.

Angel P Scialdone
Duke University School of Medicine, Durham, North Carolina, USA.

Ann Marie Flusche
Department of Neurosurgery, Duke University Medical Center, Durham, NC 27710, USA.

Kendall Reitz
Duke University School of Medicine, Durham, North Carolina, USA.

Holly C Lewis
Division of Plastic Surgery, Northwestern Feinberg School of Medicine, Chicago, Illinois, USA.

William M Tian
Division of Plastic, Maxillofacial, and Oral Surgery, Duke University, Durham, North Carolina.

Elda Fisher
Department of Surgery, Division of Plastic, Maxillofacial, and Oral Surgery, Duke University Medical Center, Durham, North Carolina, USA.

Kristen Rezak
Department of Surgery, Division of Plastic, Maxillofacial, and Oral Surgery, Duke University Medical Center, Durham, North Carolina, USA.

Ash Patel
Department of Surgery, Division of Plastic, Maxillofacial, and Oral Surgery, Duke University Medical Center, Durham, North Carolina, USA.

External Resources

PubMed ID: 40827817
