Training a Bilingual Artificial Intelligence Model on Pediatric Orthopaedic Resources to Enhance the Readability of Patient Education Materials.
Journal:
Journal of the Pediatric Orthopaedic Society of North America
Published Date:
Oct 30, 2025
Abstract
BACKGROUND: Low health literacy affects nearly one-third of adults in the United States, and almost 68 million Americans speak a language other than English at home. These barriers to understanding medical information are associated with worse clinical outcomes, higher readmission rates, and increased postoperative complications. Despite longstanding recommendations from the National Institutes of Health (NIH) and American Medical Association (AMA) to write patient education materials (PEMs) at a fifth- to sixth-grade level, many orthopaedic PEMs remain above the eighth-grade level and are not consistently available in patients' preferred languages. This study evaluates whether a trained artificial intelligence (AI) model can improve the readability of existing pediatric orthopaedic PEMs in English and Spanish without sacrificing accuracy or key content.

METHODS: Two ChatGPT-4 models were developed and trained, one in English and one in Spanish, using 51 English and 10 Spanish pediatric orthopaedic PEMs from the American Academy of Orthopaedic Surgeons (AAOS) OrthoInfo website. Models were instructed to rewrite PEMs at a fifth- to sixth-grade reading level, emphasizing anatomy, symptoms, physical examination, and treatment options. Training included multiple feedback rounds until readability targets were met. Readability was assessed using validated indices (Flesch-Kincaid, Simplified Measure of Gobbledygook (SMOG), Coleman-Liau, and Automated Readability Index for English; Fernández Huerta, Szigriszt Pazos, and Crawford for Spanish). Ten AI-enhanced English-language PEMs were reviewed for accuracy by a pediatric orthopaedic surgeon.

RESULTS: The AI models significantly improved readability across all measures. In English, the mean grade level decreased from 9.8 to 5.8 (P < .001) and Flesch Reading Ease improved from 56.9 to 76.4 (P < .001). In Spanish, Fernández Huerta scores improved from 81.0 to 100.7 (P < .001), reducing the grade level from fifth-sixth to fourth. All 10 reviewed PEMs were considered clinically accurate.

CONCLUSIONS: A fine-tuned AI model, trained specifically on validated pediatric orthopaedic materials, substantially improved the readability of PEMs in English and Spanish while maintaining accuracy. This approach enhances accessibility without introducing new content, aligning PEMs with NIH and AMA recommendations. Future research should assess patient comprehension and clinical outcomes with AI-enhanced materials.

KEY CONCEPTS:
(1) Artificial intelligence: Artificial intelligence (AI) refers to computer systems capable of performing tasks that typically require human intelligence, such as language processing; in this study, AI was used to revise and assess the readability of patient education materials.
(2) Model: An AI model is a computer algorithm trained on specific data and taught to generate text or perform tasks based on the input dataset and assigned parameters. In this study, ChatGPT-4 was used to create a customized model that was trained on validated patient information on pediatric orthopaedic conditions.
(3) Patient education materials: Patient education materials (PEMs) are written or visual tools designed to inform patients and families about medical conditions, treatments, and procedures; they play a critical role in supporting shared decision-making in pediatric orthopaedics.
(4) Readability: Readability refers to how easily a written text can be understood by a target audience; improving the readability of PEMs ensures that patients and caregivers can comprehend essential health information.
(5) Health literacy: Health literacy is the ability of individuals to obtain, process, and understand basic health information needed to make informed decisions; enhancing the readability of PEMs supports improved health literacy in pediatric populations.
LEVEL OF EVIDENCE: IV.
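As an illustration of the English readability indices named in the Methods, the sketch below computes the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas in Python. It is not the tooling used in the study, which the abstract does not specify. The function names, the regex-based sentence and word splitting, and the vowel-group syllable heuristic are all assumptions made for this example; validated calculators use more careful syllable counting, so absolute scores will differ somewhat.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (heuristic, an assumption)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a trailing silent 'e' ("grade"), but keep "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentence count, word count, syllable count) for English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    n_sent, n_words, n_syll = text_stats(text)
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (n_syll / n_words)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: approximate U.S. school grade."""
    n_sent, n_words, n_syll = text_stats(text)
    return 0.39 * (n_words / n_sent) + 11.8 * (n_syll / n_words) - 15.59


if __name__ == "__main__":
    # Hypothetical before/after sentences, not taken from any actual PEM.
    original = "Orthopaedic intervention necessitates comprehensive preoperative evaluation."
    rewritten = "The doctor will check your child before surgery."
    for label, text in (("original", original), ("rewritten", rewritten)):
        print(label, round(flesch_reading_ease(text), 1), round(flesch_kincaid_grade(text), 1))
```

Both formulas depend only on average sentence length and average syllables per word, which is why rewriting with shorter sentences and shorter words raises Reading Ease and lowers the grade level, the direction of change reported in the Results.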