A novel self-supervised graph clustering method with reliable semi-supervision.
Journal:
Neural networks : the official journal of the International Neural Network Society
PMID:
40120553
Abstract
Cluster analysis, as a core technique in unsupervised learning, has widespread applications. With the increasing complexity of data, deep clustering, which integrates the advantages of deep learning and traditional clustering algorithms, demonstrates outstanding performance in processing high-dimensional and complex data. However, when applied to graph data, deep clustering faces two major challenges: noise and sparsity. Noise introduces misleading connections, while sparsity makes it difficult to accurately capture relationships between nodes. These two issues not only increase the difficulty of feature extraction but also significantly affect clustering performance. To address these problems, we propose a novel Self-Supervised Graph Clustering model based on Reliable Semi-Supervision (SSGC-RSS). This model innovates through upstream and downstream components. The upstream component employs a dual-decoder graph autoencoder with joint clustering optimization, preserving latent information of features and graph structure, and alleviates the sparsity problem by generating cluster centers and pseudo-labels. The downstream component utilizes a semi-supervised graph attention encoding network based on highly reliable samples and their pseudo-labels to select reliable samples for training, thereby effectively reducing the interference of noise. Experimental results on multiple graph datasets demonstrate that, compared to existing methods, SSGC-RSS achieves significant performance improvements, with accuracy improvements of 0.9%, 2.0%, and 5.6% on Cora, Citeseer, and Pubmed datasets respectively, proving its effectiveness and superiority in complex graph data clustering tasks.