CellMemory: hierarchical interpretation of out-of-distribution cells using bottlenecked transformer.

Journal: Genome Biology
Published Date:

Abstract

Machine learning methods, especially Transformer architectures, have been widely employed in single-cell omics studies. However, interpretability and accurate representation of out-of-distribution (OOD) cells remain challenging. Inspired by the global workspace theory in cognitive neuroscience, we introduce CellMemory, a bottlenecked Transformer with improved generalizability designed for the hierarchical interpretation of OOD cells. Without pre-training, CellMemory outperforms existing single-cell foundation models and accurately deciphers spatial transcriptomics at high resolution. Leveraging its robust representations, we further elucidate malignant cells and their founder cells across patients, providing reliable characterizations of the cellular changes caused by the disease.

Authors

  • Qifei Wang
    China National Center for Bioinformation, Beijing, 100101, China.
  • He Zhu
    State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China.
  • Yiwen Hu
    Laboratory for Atomistic and Molecular Mechanics, Massachusetts Institute of Technology, 77 Massachusetts Ave., Cambridge, Massachusetts 02139, United States.
  • Yanjie Chen
    State Key Laboratory of Digital Medical Engineering, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, China.
  • Yuwei Wang
    College of Pharmacy, Shaanxi University of Chinese Medicine, Xianyang, 712000, PR China.
  • Guochao Li
    China National Center for Bioinformation, Beijing, 100101, China.
  • Yun Li
    School of Public Health, University of Michigan, Ann Arbor, MI, USA.
  • Jinfeng Chen
    Department of Endocrinology, Shenzhen Children's Hospital, No. 7019, Yitian Road, Futian District, Shenzhen, 518038, Guangdong Province, People's Republic of China.
  • Xuegong Zhang
    MOE Key Laboratory of Bioinformatics and Bioinformatics Division of BNRIST, Department of Automation, Tsinghua University, Beijing 100084, China.
  • James Zou
    Department of Biomedical Data Science, Stanford University, Stanford, California.
  • Manolis Kellis
    Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
  • Yue Li
    School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, China.
  • Dianbo Liu
    Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge.
  • Lan Jiang
    Institute of Bismuth Science, University of Shanghai for Science and Technology, Shanghai 200093, P. R. China.