Improving Layman Readability of Clinical Narratives with Unsupervised Synonym Replacement.

Journal: Studies in health technology and informatics
Published Date:

Abstract

We report on the development and evaluation of a prototype tool intended to help laymen and patients understand the content of clinical narratives. The tool relies largely on unsupervised machine learning applied to two large corpora of unlabeled text: a clinical corpus and a general-domain corpus. A joint semantic word-space model is created to extract easier-to-understand alternatives for words considered difficult for laymen to understand. Two domain experts evaluate the tool, and inter-rater agreement is calculated. When set to suggest ten alternatives for each difficult word, the tool suggests acceptable lay words for 55.51% of them. This and future manual evaluations will serve to further improve performance, and supervised machine learning will also be used.
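
The idea of looking up lay alternatives in a joint word-space model can be illustrated with a minimal sketch. The abstract does not say how candidates are ranked or how lay words are identified, so the sketch assumes cosine similarity for ranking and a general-corpus frequency threshold for deciding what counts as a lay word. The vectors, frequency counts, function names, and threshold below are all hypothetical toy values, not taken from the tool.

```python
import math

# Hypothetical toy vectors standing in for a joint word-space model
# trained on a clinical corpus and a general-domain corpus.
VECTORS = {
    "dyspnea":        [0.9, 0.1, 0.0],
    "breathlessness": [0.8, 0.2, 0.1],
    "apnea":          [0.85, 0.15, 0.0],
    "pyrexia":        [0.0, 0.9, 0.1],
    "fever":          [0.1, 0.8, 0.2],
    "table":          [0.0, 0.1, 0.9],
}

# Hypothetical word counts in the general-domain corpus.
# Assumption: words that are frequent there are easier for laymen.
GENERAL_FREQ = {
    "dyspnea": 3, "breathlessness": 450, "apnea": 20,
    "pyrexia": 2, "fever": 5000, "table": 9000,
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def suggest_lay_alternatives(word, vectors, general_freq, k=10, min_freq=100):
    """Return up to k words nearest to `word` that pass the frequency filter."""
    target = vectors[word]
    scored = [
        (cosine(target, vec), candidate)
        for candidate, vec in vectors.items()
        if candidate != word and general_freq.get(candidate, 0) >= min_freq
    ]
    scored.sort(reverse=True)  # most similar candidates first
    return [candidate for _, candidate in scored[:k]]


if __name__ == "__main__":
    print(suggest_lay_alternatives("dyspnea", VECTORS, GENERAL_FREQ, k=2))
```

With k=10, as in the evaluation setting described above, such a lookup would return up to ten candidates per difficult word for a human reader or evaluator to judge.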

Authors

  • Hans Moen
    TurkuNLP Group, Department of Future Technologies, University of Turku, Finland.
  • Laura-Maria Peltonen
    Nursing Science, University of Turku, and Turku University Hospital, Turku, Finland.
  • Mikko Koivumäki
    Department of Nursing Science, University of Turku, Finland.
  • Henry Suhonen
    Department of Nursing Science, University of Turku, Finland.
  • Tapio Salakoski
    TurkuNLP Group, Department of Future Technologies, University of Turku, Turku, Finland.
  • Filip Ginter
    Department of Information Technology, University of Turku, Turku, Finland.
  • Sanna Salanterä
    Nursing Science, University of Turku, and Turku University Hospital, Turku, Finland.