Auto-TA: Towards Scalable Automated Thematic Analysis (TA) via Multi-Agent Large Language Models with Reinforcement Learning
Journal:
arXiv
Published Date:
Jun 30, 2025
Abstract
Congenital heart disease (CHD) presents complex, lifelong challenges often
underrepresented in traditional clinical metrics. While unstructured narratives
offer rich insights into patient and caregiver experiences, manual thematic
analysis (TA) remains labor-intensive and unscalable. We propose a fully
automated large language model (LLM) pipeline that performs end-to-end TA on
clinical narratives, which eliminates the need for manual coding or full
transcript review. Our system employs a novel multi-agent framework, where
specialized LLM agents assume roles to enhance theme quality and alignment with
human analysis. To further improve thematic relevance, we optionally integrate
reinforcement learning from human feedback (RLHF). This supports scalable,
patient-centered analysis of large qualitative datasets and allows LLMs to be
fine-tuned for specific clinical contexts.