How Real Are Synthetic Therapy Conversations? Evaluating Fidelity in Prolonged Exposure Dialogues
Journal: arXiv
Published Date: Apr 30, 2025
Abstract
The growing adoption of synthetic data in healthcare is driven by privacy
concerns, limited access to real-world data, and the high cost of annotation.
This work explores the use of synthetic Prolonged Exposure (PE) therapeutic
conversations for Post-Traumatic Stress Disorder (PTSD) as a scalable
alternative for training and evaluating clinical models. We systematically
compare real and synthetic dialogues using linguistic, structural, and
protocol-specific metrics, including turn-taking patterns and treatment
fidelity. We also introduce and evaluate PE-specific metrics derived from
linguistic analysis and semantic modeling, offering a novel framework for
assessing clinical fidelity beyond surface fluency. Our findings show that
although synthetic data holds promise for mitigating data scarcity and
protecting patient privacy, it can struggle to capture the subtle dynamics of
therapeutic interactions. Synthetic therapy dialogues closely match structural
features of real-world conversations (e.g., speaker switch ratio: 0.98 vs.
0.99); however, they may not adequately reflect key fidelity markers (e.g.,
distress monitoring). We highlight gaps in existing evaluation frameworks and
advocate for fidelity-aware metrics that go beyond surface fluency to uncover
clinically significant failures. Our findings clarify where synthetic data can
effectively complement real-world datasets and where critical limitations
remain.
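
As an illustration of the kind of structural metric mentioned above, the following is a minimal sketch of a speaker switch ratio. The abstract does not give the exact definition, so this assumes the ratio is the fraction of adjacent turn pairs in which the speaker changes. The function name `speaker_switch_ratio` and the example speaker labels are hypothetical and not taken from the paper.

```python
from typing import Sequence


def speaker_switch_ratio(speakers: Sequence[str]) -> float:
    """Fraction of adjacent turn pairs in which the speaker changes.

    Assumed definition (not confirmed by the source): number of speaker
    changes divided by the number of adjacent turn pairs.
    """
    if len(speakers) < 2:
        # With fewer than two turns there is no adjacent pair to compare.
        return 0.0
    switches = sum(1 for prev, cur in zip(speakers, speakers[1:]) if prev != cur)
    return switches / (len(speakers) - 1)


# Hypothetical turn sequence: one place where the client speaks twice in a row.
turns = ["therapist", "client", "therapist", "client", "client", "therapist"]
ratio = speaker_switch_ratio(turns)  # 4 switches over 5 adjacent pairs
```

Under this assumed definition, a value near 1.0 means the speakers alternate almost strictly. It describes turn structure only and says nothing about whether protocol elements such as distress monitoring are present.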