Comparative Analysis of Multi-Omics Integration Using Advanced Graph Neural Networks for Cancer Classification
Journal:
arXiv
Published Date:
Oct 5, 2024
Abstract
Multi-omics data are increasingly used to advance computational
methods for cancer classification. However, multi-omics data integration poses
significant challenges due to the high dimensionality, data complexity, and
distinct characteristics of various omics types. This study addresses these
challenges and evaluates three graph neural network architectures for
multi-omics (MO) integration based on graph-convolutional networks (GCN),
graph-attention networks (GAT), and graph-transformer networks (GTN) for
classifying 31 cancer types and normal tissues. To address the
high dimensionality of multi-omics data, we employed LASSO (Least Absolute
Shrinkage and Selection Operator) regression for feature selection, leading to
the creation of LASSO-MOGCN, LASSO-MOGAT, and LASSO-MOTGN models. Graph
structures for the networks were constructed using gene correlation matrices
and protein-protein interaction networks for multi-omics integration of
messenger RNA (mRNA), microRNA (miRNA), and DNA methylation data. Such data integration
enables the networks to dynamically focus on important relationships between
biological entities, improving both model performance and interpretability.
Among the models, LASSO-MOGAT with a correlation-based graph structure achieved
state-of-the-art accuracy (95.9%) and outperformed the LASSO-MOGCN and
LASSO-MOTGN models in terms of precision, recall, and F1-score. Our findings
demonstrate that integrating multi-omics data in graph-based architectures
enhances cancer classification performance by uncovering distinct molecular
patterns that contribute to a better understanding of cancer biology and
potential biomarkers for disease progression.
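
The two preprocessing ideas named in the abstract, LASSO-based feature selection followed by a correlation-based graph, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy regression target, synthetic data, selected features as graph nodes, and hypothetical values for the penalty `alpha` and the correlation `threshold`. The LASSO solver is a plain coordinate-descent routine written for illustration; a real pipeline would more likely use an established library.

```python
import math
import random


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the LASSO proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes j's own term.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def correlation_edges(X, selected, threshold):
    """Connect selected features whose absolute correlation reaches the threshold."""
    cols = {j: [row[j] for row in X] for j in selected}
    edges = []
    for pos, a in enumerate(selected):
        for b in selected[pos + 1:]:
            if abs(pearson(cols[a], cols[b])) >= threshold:
                edges.append((a, b))
    return edges


# Synthetic data (assumption): 6 roughly zero-mean features, of which
# features 0 and 3 are correlated with each other and drive the target.
random.seed(0)
X, y = [], []
for _ in range(80):
    row = [random.gauss(0.0, 1.0) for _ in range(6)]
    row[3] = 0.8 * row[0] + 0.6 * row[3]
    X.append(row)
    y.append(2.0 * row[0] + 2.0 * row[3] + random.gauss(0.0, 0.1))

weights = lasso_coordinate_descent(X, y, alpha=0.1)      # alpha is a hypothetical choice
selected = [j for j, w in enumerate(weights) if w != 0.0]
edges = correlation_edges(X, selected, threshold=0.5)    # threshold is a hypothetical choice
print("selected features:", selected)
print("graph edges:", edges)
```

The resulting edge list is the kind of structure a graph neural network layer could consume as its adjacency information. How the study itself defines nodes, edge weights, and thresholds is not stated in the abstract, so those choices here are placeholders.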