Trade-offs between machine learning and deep learning for mental illness detection on social media.

Journal: Scientific Reports
PMID:

Abstract

Social media platforms provide valuable insights into mental health trends by capturing user-generated discussions on conditions such as depression, anxiety, and suicidal ideation. Machine learning (ML) and deep learning (DL) models are increasingly applied to classify mental health conditions from textual data, but selecting the most effective model involves trade-offs among accuracy, interpretability, and computational efficiency. This study evaluates multiple ML models, including logistic regression, random forest, and LightGBM, alongside DL architectures such as ALBERT and Gated Recurrent Units (GRUs), for both binary and multi-class classification of mental health conditions. Our findings indicate that ML and DL models achieve comparable classification performance on medium-sized datasets: ML models offer greater interpretability through variable importance scores, while DL models are more robust to complex linguistic patterns. Additionally, ML models require explicit feature engineering, whereas DL models learn hierarchical representations directly from text. Logistic regression has the advantage of capturing both positive and negative associations between features and mental health conditions, whereas tree-based models rank features by how much their splits contribute to classification decisions. This study offers empirical insights into the advantages and limitations of the different modeling approaches and provides recommendations for selecting appropriate methods based on dataset size, interpretability needs, and computational constraints.
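
The interpretability contrast described in the abstract can be illustrated with a minimal sketch. It is not the study's pipeline: the toy posts, the binary bag-of-words features, the hand-written gradient-descent fit, and the single-split Gini gain (used here as a stand-in for tree-based variable importance) are all assumptions made for illustration. The point is only that logistic regression coefficients carry a sign, while split-based importance scores are non-negative and show relevance without direction.

```python
import math

# Hypothetical toy corpus (NOT data from the study).
# Label 1 = distress-related post, 0 = control post.
POSTS = [
    ("i feel hopeless and tired", 1),
    ("so hopeless and alone tonight", 1),
    ("tired and alone again", 1),
    ("feeling hopeless again", 1),
    ("great day at the park", 0),
    ("happy and excited for the weekend", 0),
    ("great dinner with happy friends", 0),
    ("excited about the park trip", 0),
]

def vectorize(text, vocab):
    """Binary bag-of-words: 1.0 if the word occurs in the post, else 0.0."""
    words = set(text.split())
    return [1.0 if w in words else 0.0 for w in vocab]

def fit_logistic(X, y, lr=0.5, epochs=500, l2=0.01):
    """L2-regularized logistic regression fitted by batch gradient descent."""
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob minus label
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def gini(labels):
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def gini_gain(X, y, j):
    """Impurity reduction from one split on feature j (unsigned score)."""
    absent = [yi for xi, yi in zip(X, y) if xi[j] == 0.0]
    present = [yi for xi, yi in zip(X, y) if xi[j] == 1.0]
    n = len(y)
    return (gini(y)
            - len(absent) / n * gini(absent)
            - len(present) / n * gini(present))

vocab = sorted({w for text, _ in POSTS for w in text.split()})
X = [vectorize(text, vocab) for text, _ in POSTS]
y = [label for _, label in POSTS]

weights, bias = fit_logistic(X, y)
coef = dict(zip(vocab, weights))                              # signed
gain = {w: gini_gain(X, y, j) for j, w in enumerate(vocab)}   # non-negative

for word in ("hopeless", "happy"):
    print(f"{word:>9}  coefficient={coef[word]:+.3f}  gini_gain={gain[word]:.3f}")
```

In this sketch, a word that appears only in distress-related posts receives a positive coefficient and a word that appears only in control posts receives a negative one, whereas both receive a positive Gini gain. Real tree ensembles such as random forest or LightGBM aggregate importance over many splits and trees, so the single-split score here is only a simplified analogue.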

Authors

  • Zhanyi Ding
    Center for Data Science, New York University, New York, USA.
  • Zhongyan Wang
    Center for Data Science, New York University, New York, USA.
  • Yeyubei Zhang
    School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, USA.
  • Yuchen Cao
    Plastic Surgery Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100144, China.
  • Yunchong Liu
    School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, USA.
  • Xiaorui Shen
    Department of EECS, University of California, Berkeley, Berkeley, USA.
  • Yexin Tian
    Khoury College of Computer Science, Northeastern University, Boston, USA.
  • Jianglai Dai
    College of Computing, Georgia Institute of Technology, Atlanta, USA.