Development of a robust corpus for automated evaluation of online health information in Chinese using the DISCERN scale.

Journal: Journal of the American Medical Informatics Association : JAMIA

Published Date: Feb 1, 2026

Abstract

OBJECTIVE: To develop the first comprehensive, standardized annotated corpus of Chinese online health information (OHI) using the full 16-item DISCERN instrument and to establish a reliable annotation process that supports automated quality assessment.
MATERIALS AND METHODS: We assembled 510 web-sourced articles on breast cancer, arthritis, and depression. All the articles were independently annotated by three trained raters using the DISCERN scale. Annotation followed a four-step workflow: data collection and preprocessing, rater training, iterative annotation, and quality control. Raters calibrated through consensus sessions and calibration articles. The Dawid-Skene model aggregated individual annotations into final consensus scores. Original five-point ratings were retained and binarized (scores 1-3 as low quality, 4-5 as high quality) to enable both fine-grained and coarse evaluation for machine learning.
RESULTS: Initial annotation of a 60-article pilot produced low agreement (mean Krippendorff's α ≈ 0.022) due to subjective variability. Successive calibration exercises improved agreement markedly, culminating in a corpus-wide Krippendorff's α of 0.834. Consensus scores correlated strongly with individual rater scores, confirming annotation robustness. The dual-scale design yielded a relatively balanced distribution of labels across topics, with roughly equal representation of low- and high-quality articles, and preserved granularity for detailed DISCERN analysis.
DISCUSSION: Our iterative calibration approach and consensus modeling effectively addressed the subjective ambiguity inherent in quality assessment. The binary and five-class labeling strategies facilitate flexible downstream applications, allowing automated systems to perform both broad filtering and nuanced quality differentiation. The high inter-rater reliability demonstrates that rigorous training and consensus methods can overcome domain-specific annotation challenges.
CONCLUSION: The resulting Chinese OHI corpus, annotated via a standardized DISCERN framework and refined through iterative calibration, provides a robust benchmark for training and evaluating machine learning models. This resource lays the foundation for scalable, reliable automated quality assessment of OHI in Chinese public health settings.
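
The binarization rule and the agreement statistic named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `binarize` and `krippendorff_alpha` are hypothetical, and the squared-difference (interval) distance is an assumption, since the abstract does not say which distance metric was used for Krippendorff's α.

```python
from collections import Counter
from itertools import permutations


def binarize(score):
    """Map a 1-5 DISCERN item score to the binary label from the abstract:
    scores 1-3 are low quality, scores 4-5 are high quality."""
    if not 1 <= score <= 5:
        raise ValueError("DISCERN item scores range from 1 to 5")
    return "high" if score >= 4 else "low"


def krippendorff_alpha(units, distance=lambda a, b: (a - b) ** 2):
    """Krippendorff's alpha for a list of units (articles), where each unit
    is the list of ratings it received. The default distance is the interval
    metric (an assumption; the abstract does not name the metric)."""
    # Coincidence matrix: each ordered pair of ratings within a unit
    # contributes 1 / (m - 1), where m is the number of ratings in the unit.
    coincidences = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue  # a unit with a single rating cannot be paired
        for a, b in permutations(ratings, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    # Marginal totals per rating value, and the total number of pairable values.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w * distance(a, b) for (a, b), w in coincidences.items()) / n
    expected = sum(
        totals[a] * totals[b] * distance(a, b) for a in totals for b in totals
    ) / (n * (n - 1))
    if expected == 0:
        raise ValueError("alpha is undefined when all ratings are identical")
    return 1 - observed / expected


if __name__ == "__main__":
    # Toy ratings from three raters on four articles (invented for illustration).
    toy = [[1, 1, 2], [4, 5, 4], [3, 3, 3], [5, 5, 4]]
    print([binarize(s) for s in (1, 3, 4, 5)])
    print(round(krippendorff_alpha(toy), 3))
```

The toy ratings are invented and do not come from the corpus. Passing `distance=lambda a, b: float(a != b)` switches the same function to the nominal metric, which suits the binarized labels.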

Authors

Ting E

Bloomberg School of Public Health, Johns Hopkins University, MD, 21205, United States.
Xingxi Li

Department of Industrial Engineering, Tsinghua University, Beijing, 100084, China.
Jun Liang

Department of AI and IT, Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, People's Republic of China.
Junhao Ma

College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao 266590, China.
Qichuan Fang

School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Hangzhou, Zhejiang Province, 310053, China.
Shanli Chen

School of Public Health, Southwest Medical University, Luzhou, Sichuan Province, 646000, China.
Jianbo Lei

Clinical Research Center, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, People's Republic of China.
Christopher G Chute

External Resources

PubMed ID (PMID): 41223037
