Interpretable Probabilistic Latent Variable Models for Automatic Annotation of Clinical Text.

Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium

Published Date: Nov 5, 2015

Abstract

We propose Latent Class Allocation (LCA) and Discriminative Labeled Latent Dirichlet Allocation (DL-LDA), two novel interpretable probabilistic latent variable models for automatic annotation of clinical text. Both models separate the terms that are highly characteristic of textual fragments annotated with a given set of labels from other, non-discriminative terms, but they rely on generative processes with differently structured latent variables. LCA directly learns class-specific multinomials, while DL-LDA breaks them down into topics (clusters of semantically related words). Extensive experimental evaluation indicates that the proposed models outperform Naïve Bayes, a standard probabilistic classifier, and Labeled LDA, a state-of-the-art topic model for labeled corpora, on the task of automatic annotation of transcripts of motivational interviews. In addition, the output of the proposed models can be easily interpreted by clinical practitioners.

Authors

Alexander Kotov

Department of Computer Science, Wayne State University.

Mehedi Hasan

Department of Computer Science, Wayne State University.

April Carcone

Pediatric Prevention Research Center, Wayne State University.

Ming Dong

Department of Computer Science, Wayne State University.

Sylvie Naar-King

Pediatric Prevention Research Center, Wayne State University.

Kathryn Brogan Hartlieb

Department of Dietetics and Nutrition, Florida International University.

Keywords

Adolescent; Bayes Theorem; Child; Humans; Interviews as Topic; Models, Statistical; Natural Language Processing; Pediatric Obesity; Support Vector Machine

External Resources

PubMed (PMID: 26958214)
