Sparsity is All You Need: Rethinking Biological Pathway-Informed Approaches in Deep Learning
Journal:
arXiv
Published Date:
May 7, 2025
Abstract
Biologically informed neural networks typically leverage pathway annotations
to enhance performance in biomedical applications. We hypothesized that the
benefits of pathway integration do not arise from its biological relevance,
but rather from the sparsity it introduces. We conducted a comprehensive
analysis of all relevant pathway-based neural network models for predictive
tasks, critically evaluating each study's contributions. From this review, we
curated a subset of methods for which the source code was publicly available.
The comparison of the biologically informed state-of-the-art deep learning
models and their randomized counterparts showed that models based on randomized
information performed as well as biologically informed ones across
different metrics and datasets. Notably, in 3 out of the 15 analyzed models,
the randomized versions even outperformed their biologically informed
counterparts. Moreover, pathway-informed models did not show any clear
advantage in interpretability, as randomized models were still able to identify
relevant disease biomarkers despite lacking explicit pathway information. Our
findings suggest that pathway annotations may be too noisy or inadequately
explored by current methods. Therefore, we propose a methodology that can be
applied to different domains and can serve as a robust benchmark for
systematically comparing novel pathway-informed models against their randomized
counterparts. This approach enables researchers to rigorously determine whether
observed performance improvements can be attributed to biological insights.
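
As an illustration of the kind of control the benchmark relies on, the sketch below builds a randomized counterpart of a pathway mask that keeps the same sparsity. It is a minimal example under stated assumptions, not the procedure used in the study: the abstract does not specify how randomization was performed, and the names `randomize_mask`, `masked_linear`, the toy dimensions, and the 5% annotation density are all hypothetical.

```python
import numpy as np

def randomize_mask(mask, rng):
    """Return a mask with the same shape and the same number of nonzero
    entries as `mask`, but with those entries placed at random.
    (Assumed randomization scheme; other schemes are possible.)"""
    flat = mask.ravel().copy()
    rng.shuffle(flat)  # in-place permutation of the flattened entries
    return flat.reshape(mask.shape)

def masked_linear(x, weights, mask):
    """Sparse layer: only gene-pathway connections allowed by the mask contribute."""
    return x @ (weights * mask)

rng = np.random.default_rng(0)
n_genes, n_pathways = 200, 20

# Hypothetical binary annotation matrix: entry (g, p) is 1 if gene g
# belongs to pathway p. Here it is synthetic, with roughly 5% density.
pathway_mask = (rng.random((n_genes, n_pathways)) < 0.05).astype(float)
random_mask = randomize_mask(pathway_mask, rng)

# Toy inputs and weights; both layers share weights and differ only in the mask.
x = rng.normal(size=(8, n_genes))
w = rng.normal(size=(n_genes, n_pathways))
h_bio = masked_linear(x, w, pathway_mask)
h_rand = masked_linear(x, w, random_mask)
```

Because both masks have identical sparsity, any performance gap between models trained with them can be attributed to which connections are kept rather than how many. A stricter control could also preserve the number of genes per pathway, for example by permuting gene rows instead of individual entries.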