Tweet Classification Toward Twitter-Based Disease Surveillance: New Data, Methods, and Evaluations.

Journal: Journal of Medical Internet Research

Published Date: Feb 20, 2019

Abstract

BACKGROUND: The amount of medical and clinical information on the Web is increasing. Among the different types of information available, social media-based data obtained directly from people are particularly valuable and are attracting significant attention. To encourage medical natural language processing (NLP) research that exploits social media data, the 13th NII Testbeds and Community for Information access Research (NTCIR-13) Medical Natural Language Processing for Web Document (MedWeb) task provides pseudo-Twitter messages in a cross-language, multi-label corpus covering 3 languages (Japanese, English, and Chinese) and annotated with 8 symptom labels (such as cold, fever, and flu). Participants then classify each tweet into 1 of 2 categories: those that contain a patient's symptom and those that do not.

Authors

Shoko Wakamiya
Nara Institute of Science and Technology (NAIST), Japan.

Mizuki Morita
Okayama University, Okayama, Japan.

Yoshinobu Kano
Faculty of Informatics, Shizuoka University, Hamamatsu, Shizuoka, Japan.

Tomoko Ohkuma
Fuji Xerox Co., Ltd., Yokohama, Japan.

Eiji Aramaki
Nara Institute of Science and Technology (NAIST), Japan.

Keywords

Data Mining; Databases, Factual; Humans; Internet; Machine Learning; Natural Language Processing; Population Surveillance; Social Media

External Resources

PubMed (PMID: 30785407)
