Optimizing depression detection in clinical doctor-patient interviews using a multi-instance learning framework.
Journal:
Scientific Reports
PMID:
39994325
Abstract
In recent years, the number of people suffering from depression has steadily increased, and early detection is important for public well-being. Current methods for detecting depression are limited: they typically rely on the self-rating depression scale (SDS) and clinical interviews, both of which are influenced by subjective or environmental factors. To improve the objectivity and efficiency of diagnosis, deep learning techniques have been applied to automatic depression detection (ADD), offering a more accurate and objective approach. Transcribed interview data is one of the most commonly used modalities in ADD. However, previous studies have used either the response texts alone or selected question-answer pairs, which leads to information redundancy in the first case and information loss in the second. This paper is the first to apply the multiple instance learning (MIL) framework to textual interview data, aiming to overcome inadequate text representation and ineffective information extraction in long texts. In the MIL framework, each instance undergoes independent feature extraction, so that its local features are fully captured. This enhances the overall text representation and also alleviates sample imbalance in the dataset. In addition, this paper improves on previous aggregation strategies by introducing two hyper-parameters that accommodate the uncertainty inherent in text sentiment. An ensemble model of MT5 and RoBERTa (referred to as multi-MTRB) was constructed to extract features from each instance and to output a confidence score indicating whether the instance contains depressive information.
Owing to the design of the MIL framework, the proposed method is highly interpretable: it can identify the specific sentences that mark an interviewee as depressed, and LIME is additionally applied to provide a more in-depth interpretation of negative-instance sentences. This offers a promising approach to depression detection from textual interview data. The proposed method was evaluated on the DAIC-WOZ and E-DAIC datasets, achieving F1 scores of 0.88 and 0.86, respectively.
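The abstract does not spell out the aggregation rule or name its two hyper-parameters, so the following is only a minimal sketch of how instance-level confidence scores might be pooled into an interview-level decision. The names `tau` (a per-instance confidence threshold) and `min_ratio` (the fraction of instances that must exceed it) are hypothetical stand-ins, and the input scores are assumed to come from an instance-level classifier such as multi-MTRB.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class BagPrediction:
    label: int            # 1 = depressed, 0 = not depressed (interview level)
    evidence: List[int]   # indices of the instances that triggered the decision


def aggregate_bag(scores: Sequence[float],
                  tau: float = 0.5,
                  min_ratio: float = 0.1) -> BagPrediction:
    """Pool instance confidence scores into one bag-level label.

    Hypothetical rule, not the paper's: an instance counts as positive when
    its score is at least `tau`, and the bag is positive when the share of
    positive instances is at least `min_ratio`.
    """
    if not scores:
        raise ValueError("a bag must contain at least one instance")
    # Keeping the indices lets the caller point back to the sentences
    # that drove the decision, which is what makes MIL interpretable.
    evidence = [i for i, s in enumerate(scores) if s >= tau]
    label = int(len(evidence) / len(scores) >= min_ratio)
    return BagPrediction(label=label, evidence=evidence)


# One interview (bag) with four question-answer instances.
result = aggregate_bag([0.1, 0.9, 0.2, 0.8], tau=0.5, min_ratio=0.5)
print(result.label, result.evidence)
```

Under this assumed rule, raising `tau` demands stronger evidence per sentence, while raising `min_ratio` demands that more of the interview carry depressive content before the bag is flagged.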