UnPaSt: unsupervised patient stratification by differentially expressed biclusters in omics data
Journal: arXiv
Published Date: Jul 31, 2024
Abstract
Most complex diseases, including cancer and non-malignant diseases such as
asthma, have distinct molecular subtypes that require distinct clinical
approaches. However, existing computational patient stratification methods have
been benchmarked almost exclusively on cancer omics data and only perform well
when mutually exclusive subtypes can be characterized by many biomarkers. Here,
we present a large-scale evaluation that quantitatively assesses the
performance of 22 unsupervised patient stratification methods on both simulated
and real transcriptome data. Building on this experience, we developed UnPaSt
(https://apps.cosy.bio/unpast/), which is optimized for unsupervised patient
stratification and works even when only a limited number of subtype-predictive
biomarkers is available. We
evaluated all 23 methods on real-world breast cancer and asthma transcriptomics
data. Although many methods reliably detected major breast cancer subtypes,
only a few identified Th2-high asthma, and UnPaSt significantly outperformed its
closest competitors on both test datasets. Overall, we showed that UnPaSt
can detect many biologically insightful and reproducible patterns in omics
datasets.