Modeling Gene Expression Distributional Shifts for Unseen Genetic Perturbations

Journal: arXiv

Published Date: Jul 1, 2025

Abstract

We train a neural network to predict distributional responses in gene expression following genetic perturbations. This is an essential task in early-stage drug discovery, where such responses can offer insights into gene function and inform target identification. Existing methods only predict changes in the mean expression, overlooking stochasticity inherent in single-cell data. In contrast, we offer a more realistic view of cellular responses by modeling expression distributions. Our model predicts gene-level histograms conditioned on perturbations and outperforms baselines in capturing higher-order statistics, such as variance, skewness, and kurtosis, at a fraction of the training cost. To generalize to unseen perturbations, we incorporate prior knowledge via gene embeddings from large language models (LLMs). While modeling a richer output space, the method remains competitive in predicting mean expression changes. This work offers a practical step towards more expressive and biologically informative models of perturbation effects.
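As an illustration of what "higher-order statistics" of a predicted histogram means here, the following is a minimal sketch, not the authors' implementation. It assumes a single gene's predicted histogram is given as bin edges plus per-bin probabilities, and it approximates each bin by its center. The function name `histogram_moments` and its inputs are hypothetical.

```python
import numpy as np


def histogram_moments(bin_edges, probs):
    """Approximate mean, variance, skewness and kurtosis of a histogram.

    Hypothetical helper: treats all mass in a bin as sitting at the bin
    center, which is an approximation and not the paper's stated method.
    """
    edges = np.asarray(bin_edges, dtype=float)
    p = np.asarray(probs, dtype=float)
    if edges.size != p.size + 1:
        raise ValueError("need exactly one more bin edge than probabilities")
    p = p / p.sum()  # normalize so the bin masses sum to 1
    centers = 0.5 * (edges[:-1] + edges[1:])

    mean = float(np.sum(p * centers))
    dev = centers - mean
    var = float(np.sum(p * dev**2))
    if var == 0.0:
        # Degenerate histogram (all mass in one bin): shape statistics undefined.
        return {"mean": mean, "variance": 0.0,
                "skewness": float("nan"), "kurtosis": float("nan")}

    skew = float(np.sum(p * dev**3) / var**1.5)
    kurt = float(np.sum(p * dev**4) / var**2)  # non-excess kurtosis
    return {"mean": mean, "variance": var, "skewness": skew, "kurtosis": kurt}


# Toy example with made-up numbers: a symmetric three-bin histogram.
stats = histogram_moments([0.0, 1.0, 2.0, 3.0], [0.25, 0.5, 0.25])
print(stats)
```

Comparing such statistics between predicted and observed histograms is one plausible way to score a distributional model beyond the mean; the abstract does not specify the exact evaluation procedure.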

Authors

Kalyan Ramakrishnan
Jonathan G. Hedley
Sisi Qu
Puneet K. Dokania
Philip H. S. Torr
Cesar A. Prada-Medina
Julien Fauqueur
Kaspar Martens

External Resources

View on arXiv (http://arxiv.org/abs/2507.02980v1)
