Effect of Static vs. Conversational AI-Generated Messages on Colorectal Cancer Screening Intent: a Randomized Controlled Trial
Journal:
arXiv
Published Date:
Jul 10, 2025
Abstract
Large language model (LLM) chatbots show increasing promise in persuasive
communication. Yet their real-world utility remains uncertain, particularly in
clinical settings where sustained conversations are difficult to scale. In a
pre-registered randomized controlled trial, we enrolled 915 U.S. adults (ages
45-75) who had never completed colorectal cancer (CRC) screening. Participants
were randomized to: (1) no message control, (2) expert-written patient
materials, (3) single AI-generated message, or (4) a motivational interviewing
chatbot. All participants were required to remain in their assigned condition
for at least three minutes. Both AI arms tailored content using participants'
self-reported demographics, including age and gender. Both AI interventions
significantly increased stool test intentions by over 12 points
(12.9-13.8/100), compared to a 7.5-point gain for expert materials (p<.001 for all
comparisons). While the AI arms outperformed the no message control for
colonoscopy intent, neither showed improvement over expert materials. Notably,
for both outcomes, the chatbot did not outperform the single AI message in
boosting intent despite participants spending ~3.5 minutes more on average
engaging with it. These findings suggest concise, demographically tailored AI
messages may offer a more scalable and clinically viable path to health
behavior change than more complex conversational agents and generic,
time-intensive expert-written materials. Moreover, LLMs appear more persuasive for
lesser-known and less-invasive screening approaches like stool testing, but may
be less effective for entrenched preferences like colonoscopy. Future work
should examine which facets of personalization drive behavior change, whether
integrating structural supports can translate these modest intent gains into
completed screenings, and which health behaviors are most responsive to
AI-supported guidance.