Identifying tweets of personal health experience through word embedding and LSTM neural network.

Journal: BMC bioinformatics

Published Date: Jun 13, 2018

Abstract

BACKGROUND: As Twitter has become an active data source for health surveillance research, efficient and effective methods are needed to identify tweets related to personal health experience. Conventional classification algorithms rely on features engineered by human domain experts, a challenging task that demands considerable human expertise. The resulting features may not be optimal for the classification problem, and because personal experience can be expressed or described in tweets in many different ways, conventional classifiers can struggle to correctly predict personal experience tweets (PETs). In this study, we developed a method that combines word embedding with a long short-term memory (LSTM) model and requires no hand-engineered features. Through word embedding, tweet texts were represented as dense vectors, which in turn were fed to the LSTM neural network as sequences.
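The pipeline described in the abstract (tokens mapped to dense vectors, the vectors fed one at a time to an LSTM, a final score per tweet) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no architecture details, so the class name `TinyLSTMClassifier`, the dimensions, the whitespace tokenizer, and the sigmoid output layer are all assumptions chosen for illustration. The weights are random and untrained, so the score carries no meaning beyond showing the data flow.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these from labeled tweets.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class TinyLSTMClassifier:
    """Hypothetical sketch: embedding lookup -> one LSTM layer -> sigmoid score."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, vocab, embed_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.index = {word: i for i, word in enumerate(vocab)}
        self.hidden_dim = hidden_dim
        # One dense vector per vocabulary word, plus a last row for unknown words.
        self.embeddings = rand_matrix(len(vocab) + 1, embed_dim, rng)
        # Each gate has its own weights over the concatenation [x_t, h_prev].
        self.W = {g: rand_matrix(hidden_dim, embed_dim + hidden_dim, rng)
                  for g in self.GATES}
        self.b = {g: [0.0] * hidden_dim for g in self.GATES}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]

    def embed(self, tokens):
        unknown = len(self.embeddings) - 1
        return [self.embeddings[self.index.get(t, unknown)] for t in tokens]

    def step(self, x, h, c):
        z = x + h  # list concatenation of current input and previous hidden state
        pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
               for g in self.GATES}
        i = [sigmoid(v) for v in pre["input"]]
        f = [sigmoid(v) for v in pre["forget"]]
        o = [sigmoid(v) for v in pre["output"]]
        cand = [math.tanh(v) for v in pre["candidate"]]
        # Standard LSTM update: keep part of the old cell state, add new content.
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def predict_proba(self, text):
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in self.embed(tokens):
            h, c = self.step(x, h, c)
        # The final hidden state summarizes the whole token sequence.
        return sigmoid(sum(w * v for w, v in zip(self.w_out, h)))


vocab = ["i", "took", "ibuprofen", "and", "my", "headache", "went", "away"]
model = TinyLSTMClassifier(vocab)
p = model.predict_proba("I took ibuprofen and my headache went away")
print(f"Untrained PET score: {p:.3f}")
```

In practice the embeddings and LSTM weights would be trained on annotated tweets, typically with a deep learning library rather than hand-written loops; the sketch only makes explicit that no hand-engineered features enter the model, only the token sequence.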

Authors

Keyuan Jiang
Purdue University Northwest, Hammond, USA.

Shichao Feng
Department of Computer Information Technology and Graphics, Purdue University Northwest, Hammond, IN, USA.

Qunhao Song
Department of Computer Information Technology and Graphics, Purdue University Northwest, Hammond, IN, USA.

Ricardo A Calix
Purdue University Northwest, Hammond, USA.

Matrika Gupta
Purdue University Northwest, Hammond, USA.

Gordon R Bernard

Keywords

Algorithms; Health; Humans; Neural Networks, Computer; Social Media; Vocabulary

External Resources

PubMed ID: 29897323
