SCRIPT: Predicting Single-Cell Long-Range Cis-Regulation Based on Pretrained Graph Attention Networks.

Journal: Advanced science (Weinheim, Baden-Wurttemberg, Germany)
Published Date:

Abstract

Single-cell cis-regulatory relationships (CRRs) are essential for deciphering transcriptional regulation and understanding the pathogenic mechanisms of disease-associated non-coding variants. Existing computational methods struggle to predict single-cell CRRs accurately because they inadequately integrate causal biological principles and large-scale single-cell data. Here, SCRIPT (Single-cell Cis-regulatory Relationship Identifier based on Pre-Trained graph attention networks) is presented for inferring single-cell CRRs from transcriptomic and chromatin accessibility data. SCRIPT incorporates two key innovations: graph causal attention networks supported by empirical CRR evidence, and representation learning enhanced through pretraining on atlas-scale single-cell chromatin accessibility data. Validation using cell-type-specific chromatin contact and CRISPR perturbation data demonstrates that SCRIPT achieves a mean AUC of 0.89, significantly outperforming state-of-the-art methods (AUC: 0.7). Notably, SCRIPT achieves a more than twofold improvement in predicting long-range CRRs (>100 kb) compared with existing methods. By applying SCRIPT to Alzheimer's disease and schizophrenia, a framework is established for prioritizing disease-causing variants and elucidating their functional effects in a cell-type-specific manner. By uncovering molecular genetic mechanisms undetected by existing computational methods, SCRIPT provides a roadmap for advancing genetic diagnosis and target discovery.
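
The AUC figures quoted in the abstract compare predicted CRR scores against validated regulatory pairs, with long-range pairs (>100 kb) reported separately. The following is a minimal sketch of that kind of distance-stratified evaluation, not the authors' pipeline. The toy records, the function names `roc_auc` and `auc_by_distance`, and the 100 kb cutoff parameter are illustrative assumptions, and the scores are made up rather than taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as half a win (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy records, not data from the paper:
# (peak-to-gene distance in bp, validated label, predicted score)
Record = Tuple[int, int, float]
pairs: List[Record] = [
    (20_000, 1, 0.91),
    (45_000, 0, 0.30),
    (80_000, 1, 0.75),
    (60_000, 0, 0.40),
    (150_000, 1, 0.66),
    (300_000, 0, 0.70),
    (500_000, 1, 0.82),
    (250_000, 0, 0.12),
]


def auc_by_distance(records: Sequence[Record],
                    cutoff_bp: int = 100_000) -> Dict[str, float]:
    """Report AUC over all pairs and separately for pairs at or below
    the cutoff ("short") and beyond it ("long")."""
    def auc_of(subset: Sequence[Record]) -> float:
        return roc_auc([y for _, y, _ in subset], [s for _, _, s in subset])

    short = [r for r in records if r[0] <= cutoff_bp]
    long_range = [r for r in records if r[0] > cutoff_bp]
    return {
        "all": auc_of(records),
        "short": auc_of(short),
        "long": auc_of(long_range),
    }


result = auc_by_distance(pairs)
print(result)
```

Reporting the long-range stratum separately matters because a pooled AUC can be dominated by short-range pairs, which are typically easier to predict.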

Authors

  • Yu Zhang
    College of Marine Electrical Engineering, Dalian Maritime University, Dalian, China.
  • Baole Wen
    College of Medicine, Nankai University, Tianjin, 300350, China.
  • Yifeng Jiao
    Shanghai Academy of Artificial Intelligence for Science, Shanghai, 200232, China.
  • Yuchen Liu
    Department of Internal Medicine, Peking Union Medical College Hospital, Beijing, China.
  • Xin Guo
    Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong.
  • Yushuai Wu
    Shanghai Academy of Artificial Intelligence for Science, Shanghai, 200232, China.
  • Jiyang Li
    Department of Natural Medicine, School of Pharmacy, Fudan University, Shanghai 201203, PR China. jxiong@fudan.edu.cn, jfhu@fudan.edu.cn.
  • Limei Han
    Artificial Intelligence Innovation and Incubation Institute, Fudan University, Shanghai, 200433, China.
  • Yinghui Xu
    Artificial Intelligence Innovation and Incubation Institute, Fudan University, Shanghai, 200433, China.
  • Xin Gao
    Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, USA.
  • Yuan Qi
    Fudan University, Shanghai, China. Electronic address: qiyuan@fudan.edu.cn.
  • Yuan Cheng
    Science Island Branch, University of Science and Technology of China, Hefei, Anhui, China.
  • Ying He
    Cancer Research Center Nantong, Affiliated Tumor Hospital of Nantong University, and Medical School of Nantong University, Nantong, China.
  • Weidong Tian
    State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Science, Fudan University, Shanghai 200433, PR China; Children's Hospital of Fudan University, Shanghai 200433, PR China. Electronic address: weidong.tian@fudan.edu.cn.
