Predicting N6-Methyladenosine Sites in Multiple Tissues of Mammals through Ensemble Deep Learning.

Journal: International journal of molecular sciences

Published Date: Dec 7, 2022

Abstract

N6-methyladenosine (m6A) is the most abundant modification of eukaryotic messenger RNA and plays an essential regulatory role in the control of cellular functions and gene expression. However, detecting mRNA m6A transcriptome-wide at base resolution remains an outstanding challenge for experimental approaches, which are generally time-consuming and expensive. Computational methods are a promising strategy for accurate in silico detection of m6A modification sites from the large amount of available RNA sequence data. Unfortunately, existing computational models usually predict m6A sites in a single species without considering tissue-level differences, and most are built on low-confidence data generated by m6A antibody immunoprecipitation (IP)-based sequencing, which restricts their reliability and generalizability. Here, we review recent advances in the computational prediction of m6A sites and construct a new approach, im6APred, that uses ensemble deep learning to accurately identify m6A sites from high-confidence data in multiple tissues of mammals. im6APred builds on a comprehensive evaluation of multiple classification methods, including four traditional classification algorithms, three deep learning methods, and their ensembles. The optimal base-classifier combinations are then chosen by five-fold cross-validation to obtain an effective stacked model. im6APred achieves an area under the receiver operating characteristic curve (AUROC) of 0.82-0.91 on independent tests, indicating that the model can learn general methylation rules on RNA bases and generalize to transcriptome-wide m6A identification. Moreover, AUROCs of 0.77-0.96 were achieved in cross-species/tissue validation on the benchmark dataset, demonstrating differences in predictive performance at the tissue level and the need for tissue-specific models for m6A site prediction.
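
For readers unfamiliar with the evaluation terms used in the abstract, the following is a minimal sketch of how an AUROC value and a five-fold partition can be computed. It is illustrative only and is not the authors' code: the function names `auroc` and `k_fold_indices`, the interleaved fold assignment, and the toy labels and scores are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one.
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, validation) index pairs.
    Samples are assigned to folds in an interleaved fashion (hypothetical
    choice for this sketch; real pipelines usually shuffle and stratify)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i in range(k):
        validation = folds[i]
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        splits.append((train, validation))
    return splits


if __name__ == "__main__":
    # Toy data: 1 = methylated site, 0 = non-methylated site (made up).
    toy_labels = [1, 1, 0, 0, 1, 0]
    toy_scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
    print("AUROC:", auroc(toy_labels, toy_scores))
    for train_idx, val_idx in k_fold_indices(10, k=5):
        print("train:", train_idx, "validation:", val_idx)
```

In a stacked model of the kind described above, each base classifier would typically be scored on the held-out fold of every split, and those held-out scores would then be compared by AUROC to decide which base classifiers to combine.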

Authors

Zhengtao Luo
School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui, China.

Liliang Lou
Computer Department, Jingdezhen Ceramic University, Jingdezhen 333403, China.

Wangren Qiu
Computer Department, Jingdezhen Ceramic Institute, Jingdezhen 333046, China.

Zhaochun Xu
Computer Department, Jingdezhen Ceramic University, Jingdezhen, China.

Xuan Xiao
Computer Department, Jingdezhen Ceramic Institute, Jingdezhen 333046, China.

Keywords

Adenosine; Animals; Computational Biology; Deep Learning; Mammals; Reproducibility of Results; RNA

External Resources

PubMed ID: 36555143
