HeadTailTransfer: An efficient sampling method to improve the performance of graph neural network method in predicting sparse ncRNA-protein interactions.

Journal: Computers in biology and medicine

Published Date: Mar 15, 2023

Abstract

Noncoding RNA (ncRNA) is functional RNA produced by DNA transcription, and most transcribed genes yield ncRNA. ncRNA is not directly translated into protein, but it participates in gene expression and affects protein synthesis, and so plays an important role in biological processes such as growth, proliferation, metabolism, and information transmission. Understanding ncRNA-protein interactions is therefore the basis for studying how ncRNA regulates protein-related biological activities. Because verifying ncRNA-protein interactions through biological experiments is expensive and time-consuming, prediction methods based on machine learning have developed rapidly. Graph neural network (GNN) models have recently stood out for their excellent performance, but a general GNN framework for predicting ncRNA-protein interactions is still lacking. We propose a GNN-based framework for predicting ncRNA-protein interactions that uses topological structure information to complete prediction tasks faster and more accurately. In some smaller datasets, however, many ncRNA nodes lack neighbor information, which lowers prediction accuracy. In some larger datasets, the long-tail distribution degrades predictions for the tail nodes (sparse nodes linked to few neighbors). We therefore propose a new sampling method, HeadTailTransfer, to mitigate these effects. Experimental results demonstrate the effectiveness of this method. In particular, for task-specific prediction on the RPI369 dataset with the GraphSAGE-based neural network framework, the AUC and ACC values increased from 56.8% and 52.2% to 80.2% and 71.8%, respectively. Our data and code are available at https://github.com/kkkayle/HeadTailTransfer.
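To make the notion of head and tail nodes concrete, the following minimal sketch counts how many proteins each ncRNA interacts with and splits the ncRNA nodes by degree. It is an illustration only, not the HeadTailTransfer sampling procedure itself: the function name `split_head_tail`, the cutoff `max_tail_degree`, and the toy edge list are all assumptions introduced here, and the abstract does not state how the authors define the head/tail boundary.

```python
from collections import Counter


def split_head_tail(edges, max_tail_degree=2):
    """Split ncRNA nodes into head and tail sets by interaction count.

    edges: iterable of (ncrna_id, protein_id) interaction pairs.
    max_tail_degree: hypothetical cutoff (an assumption, not taken from
        the paper); an ncRNA with at most this many protein neighbors
        is treated as a tail node.
    """
    # Deduplicate pairs so repeated records do not inflate the degree.
    degree = Counter(rna for rna, _ in set(edges))
    head = {node for node, d in degree.items() if d > max_tail_degree}
    tail = {node for node, d in degree.items() if d <= max_tail_degree}
    return head, tail


# Toy interaction list (made up for illustration).
edges = [
    ("r1", "p1"), ("r1", "p2"), ("r1", "p3"),
    ("r2", "p1"),
    ("r3", "p2"), ("r3", "p3"),
]
head, tail = split_head_tail(edges)
print("head:", sorted(head))
print("tail:", sorted(tail))
```

In a long-tailed dataset most ncRNA nodes would fall into the tail set, which is the situation the abstract describes as hurting prediction quality.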

Authors

Jinhang Wei

Wenzhou University of Technology, Wenzhou, 325000, China.

Linlin Zhuo

School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, Zhejiang 325035, China; College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China.

Shiyao Pan

Wenzhou University of Technology, Wenzhou, 325000, China.

Xinze Lian

Wenzhou University of Technology, Wenzhou, 325000, China.

Xiaojun Yao

Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, PR China.

Xiangzheng Fu

Keywords

Machine Learning; Neural Networks, Computer; Protein Binding; Proteins; RNA, Untranslated

External Resources

PubMed ID: 36958237
