Disentangling Symptom Heterogeneity in Large-Scale Psychiatric Text: Domain-Adapted vs. Instruction-Tuned Transformers

Journal: medRxiv
Published Date:

Abstract

Psychiatric disorders are fundamentally challenged by symptom heterogeneity, high comorbidity, and the absence of objective biomarkers, which together result in substantial variability in clinical assessment and treatment selection. Patient-generated language captures rich information about subjective experience and symptom severity, which can be systematically encoded and analyzed using computational models, making it a scalable signal for psychiatric assessment. We compare two approaches: (i) a domain-specialized transformer fine-tuned on clinical language, based on the Bio-ClinicalBERT encoder architecture, and (ii) a large-scale instruction-tuned generalist encoder (Instructor-XL) used as a frozen feature extractor with a shallow classification head. A corpus of = 151,228 de-identified texts was compiled from five public sources, covering four psychiatric phenotypes: anxiety, depression, schizophrenia, and suicidal intention. Models were evaluated using stratified 10-fold cross-validation with cost-sensitive training, prioritizing imbalance-aware metrics, including Macro-1 and Matthews Correlation Coefficient (MCC), over accuracy. Bio-ClinicalBERT achieved superior overall performance (Macro-1 = 0.78, MCC = 0.6752), indicating more reliable separation of diagnostically overlapping affective categories. In contrast, Instructor-XL achieved its highest class-specific performance for schizophrenia (1 = 0.798). Explainability analyses suggest that the domain-specialized model places greater weight on clinically relevant terms, whereas the generalist model relies on a broader set of lexical features.

Authors

  • Varone
  • G.; Kumar
  • P.; Brown
  • J.; Boulila
  • W.