Reconstructing intra-tumor fitness landscapes from scSeq CNA genotypes via simulation-based Bayesian inference and Deep Learning

Journal: bioRxiv
Published Date:

Abstract

Inferring the selective effects of copy-number alterations (CNAs) from clonal tumor data is essential for understanding tumor evolution. In practice, intra-tumor evolutionary parameters are typically estimated by fitting population genetic models to observed data using maximum likelihood or Bayesian methods. However, realistic mechanistic models often lead to intractable likelihoods, limiting the applicability of conventional inference approaches. Here, we introduce a likelihood-free, simulation-based framework for inferring intra-tumor selection coefficients directly from clonal CNA profiles. Our approach employs neural posterior estimation to amortize inference across simulated tumors and uses normalizing flows to flexibly parameterize high-dimensional posterior distributions while enabling robust uncertainty quantification. Our primary model, CloneMLP-NPE, learns representations of whole-tumor CNA genotypes using a multilayer perceptron (MLP)-based encoder. We compare this model against two baselines: (i) a Set Transformer encoder applied to the same whole-tumor CNA profiles, and (ii) a consensus-based approach that relies only on the CNA profile of the most abundant clone. On held-out simulations, CloneMLP-NPE achieves the strongest overall performance, yielding well-calibrated posterior distributions and more accurate posterior mean estimates than both baselines.
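As a purely illustrative aside, the notion of a "well-calibrated" posterior on held-out simulations can be sketched with a toy coverage check. The sketch below is not the authors' pipeline. It assumes a one-dimensional Gaussian model with a standard normal prior, and it uses the exact conjugate posterior as a stand-in for a learned normalizing-flow posterior. All function names and parameter values are hypothetical. The idea is that if posteriors are calibrated, a 90% credible interval should contain the true simulated parameter in roughly 90% of held-out simulations.

```python
import math
import random

def simulate(theta, n_obs, sigma, rng):
    # Toy simulator (assumption): n_obs noisy observations of the parameter theta.
    return [rng.gauss(theta, sigma) for _ in range(n_obs)]

def gaussian_posterior(xs, sigma, prior_var=1.0):
    # Exact conjugate posterior for a N(0, prior_var) prior and Gaussian noise.
    # This stands in for the posterior a trained neural estimator would return.
    precision = 1.0 / prior_var + len(xs) / sigma**2
    var = 1.0 / precision
    mean = var * sum(xs) / sigma**2
    return mean, var

def empirical_coverage(n_sims=2000, n_obs=5, sigma=1.0, z=1.6449, seed=0):
    # z = 1.6449 gives the central 90% interval of a Gaussian.
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        theta = rng.gauss(0.0, 1.0)          # draw a "true" parameter from the prior
        xs = simulate(theta, n_obs, sigma, rng)
        mean, var = gaussian_posterior(xs, sigma)
        half_width = z * math.sqrt(var)
        hits += (mean - half_width) <= theta <= (mean + half_width)
    return hits / n_sims

coverage = empirical_coverage()
print(f"empirical coverage of nominal 90% intervals: {coverage:.3f}")
```

Coverage well below the nominal level would indicate overconfident posteriors, and coverage well above it would indicate underconfident ones. With a learned estimator, the interval would be computed from posterior samples rather than from a closed form.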

Authors

  • KafiKang, M.
  • Skums, P.

Categories