Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach.

Journal: BMC Medical Informatics and Decision Making
Published Date:

Abstract

BACKGROUND: The medical subdomain of a clinical note, such as cardiology or neurology, is useful content-derived metadata for developing downstream machine learning applications. To classify the medical subdomain of a note accurately, we constructed a machine learning-based natural language processing (NLP) pipeline and developed medical subdomain classifiers based on the content of the note.

Authors

  • Wei-Hung Weng
    Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, 4th Floor, Boston, MA, 02115, USA. ckbjimmy@mit.edu.
  • Kavishwar B Wagholikar
    Laboratory of Computer Science, Massachusetts General Hospital, 50 Staniford Street, Suite 750, Boston, MA, 02114, USA.
  • Alexa T McCray
    Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck Street, 4th Floor, Boston, MA, 02115, USA.
  • Peter Szolovits
    Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
  • Henry C Chueh
    Laboratory of Computer Science, Massachusetts General Hospital, 50 Staniford Street, Suite 750, Boston, MA, 02114, USA.