A patient-centered approach to developing and validating a natural language processing model for extracting patient-reported symptoms.

Journal: Scientific Reports

Published Date: Jul 29, 2025

Abstract

Patient-reported symptoms provide valuable insights into patient experiences and can enhance healthcare quality; however, effectively capturing them remains challenging. Although natural language processing (NLP) models have been developed to extract adverse events and symptoms from medical records written by healthcare professionals, few studies have focused on models designed for patient-generated narratives. This study developed an NLP model to extract patient-reported symptoms from pharmaceutical care records and validated its effectiveness in analyzing diverse patient-generated narratives. The target dataset comprised "Subjective" sections of pharmaceutical care records created by community pharmacists for patients prescribed anticancer drugs. Two annotation guidelines were applied to develop robust ground-truth data, which were used to develop and evaluate a new transformer-based named entity recognition model. Model performance was compared with that of an existing tool for Japanese clinical texts, and the model was tested on external patient-generated blog data to evaluate generalizability. The newly developed BERT-CRF model significantly outperformed the existing model, achieving an F1 score > 0.8 on pharmaceutical care records and extracting > 98% of physical symptom entries from patient blogs, with a 20% improvement over the existing tool. These findings highlight the importance of fine-tuning models using patient-specific narrative data to capture nuanced and colloquial symptom expressions.
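To make the reported F1 score concrete, the following is a minimal sketch of how an entity-level F1 can be computed for a named entity recognition model from BIO-tagged sequences. It is not the authors' evaluation code. The abstract does not state the matching criterion, so exact span matching is an assumption here, and the `SYM` label and the function names `bio_to_spans` and `entity_f1` are hypothetical.

```python
from typing import List, Set, Tuple

# A span is (start index, end index exclusive, label). The "SYM" label used
# below is a hypothetical stand-in for a symptom entity type.
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert a BIO tag sequence into a set of labeled entity spans."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel flushes any entity still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((start, i, label))
                start, label = None, None
            if tag.startswith("B-"):
                start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact span matching (an assumption)."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: two gold symptom mentions, of which the model finds one.
gold_tags = ["B-SYM", "I-SYM", "O", "B-SYM"]
pred_tags = ["B-SYM", "I-SYM", "O", "O"]
precision, recall, f1 = entity_f1(gold_tags, pred_tags)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the strictest common choice. A relaxed (overlap-based) criterion would credit partially matched spans and generally yields higher scores, so the two are not directly comparable.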

Authors

Satoshi Watabe
Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan.

Yuki Yanagisawa
Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan.

Kyoko Sayama
Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan.

Sakura Yokoyama
Division of Drug Informatics, Keio University Faculty of Pharmacy, 1-5-30, Shibakoen, Minato-ku, Tokyo, 105-8512, Japan.

Mitsuhiro Someya
Nakajima Pharmacy, Hokkaido, Japan.

Ryoo Taniguchi
Nakajima Pharmacy, Hokkaido, Japan.

Shuntaro Yada
Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Nara, Japan.

Eiji Aramaki
Nara Institute of Science and Technology (NAIST), Japan.

Hayato Kizaki
Keio University Faculty of Pharmacy, Division of Drug Informatics, Tokyo, Japan.

Masami Tsuchiya
Division of Drug Informatics, Keio University Faculty of Pharmacy, Tokyo, Japan.

Shungo Imai
Faculty of Pharmaceutical Sciences, Hokkaido University, Sapporo, Japan.

Satoko Hori
Keio University Faculty of Pharmacy, Division of Drug Informatics, Tokyo, Japan.

Keywords

Electronic Health Records; Humans; Natural Language Processing; Patient Reported Outcome Measures; Patient-Centered Care

External Resources

PubMed ID: 40730600
