Synthetic augmentation of cancer cell line multi-omic datasets using unsupervised deep learning.

Journal: Nature Communications
Published Date:

Abstract

Integrating diverse types of biological data is essential for a holistic understanding of cancer biology, yet it remains challenging due to data heterogeneity, complexity, and sparsity. To address this, our study introduces an unsupervised deep learning model, MOSA (Multi-Omic Synthetic Augmentation), specifically designed to integrate and augment the Cancer Dependency Map (DepMap). Harnessing orthogonal multi-omic information, the model generates molecular and phenotypic profiles, increasing the number of multi-omic profiles by 32.7% and thereby producing a complete DepMap for 1523 cancer cell lines. The synthetically enhanced data increase statistical power, uncover less-studied mechanisms associated with drug resistance, and refine the identification of genetic associations and the clustering of cancer cell lines. By applying SHapley Additive exPlanations (SHAP) for model interpretation, MOSA reveals multi-omic features essential for cell clustering and biomarker identification related to drug and gene dependencies. This understanding is crucial for developing much-needed effective strategies to prioritize cancer targets.

Authors

  • Zhaoxiang Cai
    ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia.
  • Sofia Apolinário
    INESC-ID, 1000-029, Lisboa, Portugal.
  • Ana R Baião
    INESC-ID, 1000-029, Lisboa, Portugal.
  • Clare Pacini
    Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, CB10 1SA, UK.
  • Miguel D Sousa
    INESC-ID, 1000-029, Lisboa, Portugal.
  • Susana Vinga
    IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal.
  • Roger R Reddel
    ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia.
  • Phillip J Robinson
    ProCan®, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, NSW, Australia.
  • Mathew J Garnett
    Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, CB10 1SA, UK.
  • Qing Zhong
  • Emanuel Gonçalves
    INESC-ID, 1000-029, Lisboa, Portugal. emanuel.v.goncalves@tecnico.ulisboa.pt.