A novel coarsened graph learning method for scalable single-cell data analysis.

Journal: Computers in biology and medicine
PMID:

Abstract

The emergence of single-cell technologies, including flow and mass cytometry and single-cell RNA sequencing, has revolutionized the study of cellular heterogeneity, generating vast datasets rich in biological insights. Although graph-based analyses are effective at deciphering the complexity of these datasets, managing large-scale graph representations of single-cell data remains computationally challenging. Coarsening has been employed to tackle this difficulty. However, current coarsening techniques such as Cytocoarsening, Heavy Edge Matching (HEM), and Locally Variable Edges (LVE) often suffer from slow processing speeds and limited adaptability. To address these challenges, we propose a novel approach, Feature-Aware Graph Coarsening via Hashing (FACH), which integrates locality-sensitive hashing for scalable and efficient single-cell data analysis. The method directly extracts informative, low-dimensional cell representations from raw single-cell RNA sequencing and mass cytometry data, significantly improving processing speed while preserving critical biological features, such as transcriptional signatures and network topology. We demonstrate its effectiveness in downstream tasks, such as scalable graph neural network training on coarsened single-cell data, highlighting its ability to retain crucial biological features and enable efficient, accurate analyses. FACH reduces computational time by at least 50% compared to existing methods and achieves superior classification accuracy, such as 88.1% on the Baron Human dataset, underscoring its efficiency and precision in large-scale single-cell analysis.
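
To make the idea of hashing-based coarsening concrete, the following is a minimal sketch and not the authors' FACH implementation. The abstract does not specify the hash family, so random-projection (p-stable) locality-sensitive hashing is assumed here. The function name `lsh_coarsen` and the parameters `n_projections` and `bin_width` are hypothetical. Cells whose feature vectors hash to the same bucket are merged into one supernode, and the coarse adjacency is obtained as P^T A P, where P is the cell-to-supernode assignment matrix.

```python
import numpy as np

def lsh_coarsen(X, A, n_projections=8, bin_width=1.0, seed=0):
    """Merge nodes that share an LSH bucket (illustrative sketch, not FACH itself).

    X: (n, d) cell feature matrix; A: (n, n) symmetric adjacency matrix.
    Returns the supernode index per cell, supernode features, coarse adjacency.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumed hash family: random Gaussian projections with quantized offsets.
    W = rng.normal(size=(d, n_projections))
    b = rng.uniform(0.0, bin_width, size=n_projections)
    codes = np.floor((X @ W + b) / bin_width).astype(np.int64)
    # Cells with identical hash codes fall into the same bucket (supernode).
    _, assign = np.unique(codes, axis=0, return_inverse=True)
    assign = assign.ravel()
    k = int(assign.max()) + 1
    # One-hot assignment matrix P of shape (n, k).
    P = np.zeros((n, k))
    P[np.arange(n), assign] = 1.0
    sizes = P.sum(axis=0)
    X_c = (P.T @ X) / sizes[:, None]   # supernode feature = mean of its members
    A_c = P.T @ A @ P                  # summed edge weights between supernodes
    np.fill_diagonal(A_c, 0.0)         # drop edges internal to a supernode
    return assign, X_c, A_c

# Synthetic example: two well-separated groups of "cells" and a random sparse graph.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (50, 5)), rng.normal(5, 0.1, (50, 5))])
A = (rng.random((100, 100)) < 0.05).astype(float)
A = np.triu(A, 1)
A = A + A.T
assign, X_c, A_c = lsh_coarsen(X, A, bin_width=4.0)
print(X_c.shape[0], "supernodes from", X.shape[0], "cells")
```

In this sketch, a larger `bin_width` or fewer projections gives coarser buckets and therefore fewer supernodes. Hashing takes a single pass over the cells, which is the general reason hashing-based coarsening can be faster than methods that iteratively match or contract edges.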

Authors

  • Mohit Kataria
    Yardi School of Artificial Intelligence, Indian Institute of Technology (IIT) Delhi, New Delhi, India. Electronic address: Mohit.Kataria@scai.iitd.ac.in.
  • Ekta Srivastava
    Department of Electrical Engineering, Indian Institute of Technology (IIT) Delhi, New Delhi, India.
  • Kumar Arjun
    Department of Mathematics, Indian Institute of Technology (IIT) Delhi, New Delhi, India.
  • Sandeep Kumar
    Cellon S.A., ZAE Robert Steichen, 16 rue Hèierchen, L-4940, Bascharage, Luxembourg.
  • Ishaan Gupta
    Indian Institute of Technology (IIT), Delhi, India.
  • Jayadeva
    Department of Electrical Engineering, Indian Institute of Technology, Delhi, India. Electronic address: jayadeva@ee.iitd.ac.in.