A novel self-supervised graph clustering method with reliable semi-supervision.

Journal: Neural networks : the official journal of the International Neural Network Society
PMID:

Abstract

Cluster analysis, a core technique in unsupervised learning, has widespread applications. As data grow more complex, deep clustering, which combines the advantages of deep learning and traditional clustering algorithms, performs well on high-dimensional and complex data. However, when applied to graph data, deep clustering faces two major challenges: noise and sparsity. Noise introduces misleading connections, while sparsity makes it difficult to accurately capture relationships between nodes. Together, these issues make feature extraction harder and significantly degrade clustering performance. To address these problems, we propose a novel Self-Supervised Graph Clustering model based on Reliable Semi-Supervision (SSGC-RSS). The model consists of an upstream and a downstream component. The upstream component employs a dual-decoder graph autoencoder with joint clustering optimization, which preserves the latent information of both features and graph structure and alleviates the sparsity problem by generating cluster centers and pseudo-labels. The downstream component selects highly reliable samples and their pseudo-labels to train a semi-supervised graph attention encoding network, thereby effectively reducing the interference of noise. Experimental results on multiple graph datasets demonstrate that SSGC-RSS achieves significant improvements over existing methods, with accuracy gains of 0.9%, 2.0%, and 5.6% on the Cora, Citeseer, and Pubmed datasets, respectively, demonstrating its effectiveness and superiority in complex graph data clustering tasks.
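
The abstract does not state how reliable samples are chosen, so the following is only an illustrative sketch of one common approach, not the authors' procedure. It assumes a Student's t soft assignment of node embeddings to cluster centers (as used in many deep clustering methods) and a confidence threshold; the function names `soft_assign` and `select_reliable`, the `alpha` parameter, and the threshold value 0.9 are all hypothetical.

```python
import numpy as np

def soft_assign(z, centers, alpha=1.0):
    """Student's t soft assignment of embeddings z (n, d) to centers (k, d).

    Assumption: this kernel is a common choice in deep clustering; the
    abstract does not say which assignment SSGC-RSS uses.
    """
    # Squared Euclidean distance between every embedding and every center.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    # Normalize each row so it is a distribution over clusters.
    return q / q.sum(axis=1, keepdims=True)

def select_reliable(q, threshold=0.9):
    """Keep samples whose top assignment probability reaches the threshold.

    Returns the indices of the kept samples and their pseudo-labels.
    The threshold is a hypothetical hyperparameter.
    """
    pseudo_labels = q.argmax(axis=1)
    confidence = q.max(axis=1)
    idx = np.flatnonzero(confidence >= threshold)
    return idx, pseudo_labels[idx]

# Toy embeddings: three points near a center, one exactly between the two.
z = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [2.5, 2.5]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])

q = soft_assign(z, centers)
idx, labels = select_reliable(q, threshold=0.9)
print(idx, labels)
```

In this toy case the ambiguous midpoint is dropped, and only the confidently assigned points would be passed, with their pseudo-labels, to a downstream semi-supervised encoder.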

Authors

  • Weijia Lu
    School of Communication and Information Engineering, Shanghai University, Shanghai, China.
  • Min Wang
    National and Local Joint Engineering Research Center of Ecological Treatment Technology for Urban Water Pollution, Wenzhou University, Wenzhou 325035, China.
  • Yun Yu
    School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, 211166, China. yuyun@njmu.edu.cn.
  • Liang Ma
    College of Information and Management, National University of Defense Technology, Changsha 410073, China.
  • Yaxiang Shi
    Network Information Center, Zhongda Hospital Southeast University, Nanjing 210009, China.
  • Zhongqiu Huang
    Department of Information, The First Affiliated Hospital with Nanjing Medical University, Nanjing, Jiangsu, 210029, China.
  • Ming Gong
    Department of Thoracic Surgery, Affiliated Hospital of Zunyi Medical University, Zunyi, China.