Interdisciplinary approach to identify language markers for post-traumatic stress disorder using machine learning and deep learning.

Journal: Scientific Reports
Published Date:

Abstract

Post-traumatic stress disorder (PTSD) lacks clear biomarkers in clinical practice. This study investigates language as a potential diagnostic biomarker for PTSD. We analyze an original cohort of 148 individuals exposed to the November 13, 2015, terrorist attacks in Paris. The interviews, conducted 5-11 months after the event, involve individuals from similar socioeconomic backgrounds who were exposed to the same incident, answered identical questions, and were assessed with uniform PTSD measures. To draw potentially clinically relevant insights from this dataset, we propose a three-step interdisciplinary methodology that integrates expertise from psychiatry, linguistics, and natural language processing (NLP) to examine the relationship between language and PTSD. The first step assesses a clinical psychiatrist's ability to diagnose PTSD from interview transcripts alone. The second step uses statistical analysis and machine learning models to build language features based on psycholinguistic hypotheses and to evaluate their predictive strength. The third step applies a hypothesis-free deep learning approach to classify PTSD in our cohort. The clinical psychiatrist diagnosed PTSD with an area under the curve (AUC) of 0.72, comparable to a gold-standard questionnaire (AUC ≈ 0.80). The machine learning model achieved a diagnostic AUC of 0.69, and the deep learning approach an AUC of 0.64. An examination of model errors informs our discussion. Importantly, the study controls for confounding factors, establishes associations between language and DSM-5 subsymptoms, and integrates automated methods with qualitative analysis. This study provides a direct and methodologically robust description of the relationship between PTSD and language. Our work lays the groundwork for advancing early and accurate diagnosis and for using linguistic markers to assess the effectiveness of pharmacological treatments and psychotherapies.
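
For readers unfamiliar with the AUC figures quoted in the abstract, the sketch below shows how an AUC can be computed from binary labels and classifier scores: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. This is an illustration only and is not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs
    in which the positive case has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = PTSD, 0 = no PTSD (not from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.64 to 0.80 should be read.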

Authors

  • Robin Quillivic
    PSL-EPHE, Paris, France. robin.quillivic@ephe.psl.eu.
  • Frédérique Gayraud
    Laboratoire dynamique du langage, UMR 5596, CNRS, université Lyon-II, Lyon, France.
  • Yann Auxéméry
    Centre Hospitalier de Jury-les-Metz, centre de réhabilitation pour adultes, Metz, France.
  • Laurent Vanni
    CNRS, UMR 7320 : Bases, Corpus, Langage, Nice, France.
  • Denis Peschanski
    Université PARIS 1 Panthéon-Sorbonne, Paris, France.
  • Francis Eustache
    PSL-EPHE, Paris, France.
  • Jacques Dayan
    PSL-EPHE, Paris, France.
  • Salma Mesmoudi
    Sorbonnes University Paris 1, MATRICE Project, ISC-PIF, 113, rue Nationale, 75013 Paris, France. Electronic address: salma.mesmoudi@iscpif.fr.