Informed, but Not Always Improved: Challenging the Benefit of Background Knowledge in GNNs

Journal: arXiv

Published Date: May 16, 2025

Abstract

In complex and low-data domains such as biomedical research, incorporating background knowledge (BK) graphs, such as protein-protein interaction (PPI) networks, into graph-based machine learning pipelines is a promising research direction. However, while BK is often assumed to improve model performance, its actual contribution and the impact of imperfect knowledge remain poorly understood. In this work, we investigate the role of BK in an important real-world task: cancer subtype classification. Surprisingly, we find that (i) state-of-the-art GNNs using BK perform no better than uninformed models like linear regression, and (ii) their performance remains largely unchanged even when the BK graph is heavily perturbed. To understand these unexpected results, we introduce an evaluation framework, which employs (i) a synthetic setting where the BK is clearly informative and (ii) a set of perturbations that simulate various imperfections in BK graphs. With this, we test the robustness of BK-aware models in both synthetic and real-world biomedical settings. Our findings reveal that careful alignment of GNN architectures with BK characteristics is necessary, and that such alignment holds the potential for significant performance improvements.

Authors

Kutalmış Coşkun
Ivo Kavisanczki
Amin Mirzaei
Tom Siegl
Bjarne C. Hiller
Stefan Lüdtke
Martin Becker

External Resources

View on arXiv (http://arxiv.org/abs/2505.11023v1)
