Towards Scalable SOAP Note Generation: A Weakly Supervised Multimodal Framework
Journal:
arXiv
Published Date:
Jun 12, 2025
Abstract
Skin carcinoma is the most prevalent form of cancer globally, accounting for
over $8 billion in annual healthcare expenditures. In clinical settings,
physicians document patient visits using detailed SOAP (Subjective, Objective,
Assessment, and Plan) notes. However, manually generating these notes is
labor-intensive and contributes to clinician burnout. In this work, we propose
a weakly supervised multimodal framework to generate clinically structured SOAP
notes from limited inputs, including lesion images and sparse clinical text.
Our approach reduces reliance on manual annotations and large annotated
datasets, enabling scalable, clinically grounded documentation while
alleviating clinician burden. Our method achieves performance
comparable to GPT-4o, Claude, and DeepSeek Janus Pro across key clinical
relevance metrics. To evaluate clinical quality, we introduce two novel
metrics, MedConceptEval and Clinical Coherence Score (CCS), which assess
semantic alignment with expert medical concepts and input features,
respectively.