An ensemble learning framework for potential miRNA-disease association prediction with positive-unlabeled data.

Journal: Computational biology and chemistry

Published Date: Aug 24, 2021

Abstract

To explore the pathogenic mechanisms of microRNA (miRNA) in diverse diseases, many researchers have concentrated on discovering potential associations between miRNAs and diseases using machine learning methods. However, the prediction accuracy of supervised machine learning methods is limited by the lack of experimentally validated uncorrelated miRNA-disease pairs. Without these negative samples, training a highly accurate model is much more difficult. Unlike traditional miRNA-disease prediction models, which use randomly selected unknown samples as negative training samples, we propose an ensemble learning framework to solve this positive-unlabeled (PU) learning problem. The framework incorporates two steps: a novel semi-supervised K-means (SS-Kmeans) method that extracts reliable negative samples from unknown miRNA-disease pairs, and a subagging method that generates diverse training sample sets to make full use of those reliable negative samples for ensemble learning. Combined with an effective random vector functional link (RVFL) network as the prediction model, the proposed framework showed superior prediction accuracy compared with other popular approaches. A case study on lung and gastric neoplasms further confirms the framework's efficacy at identifying miRNA-disease associations.
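The two-step idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of SS-Kmeans, the subagging configuration, or the RVFL settings, so everything below is an assumption made for illustration: the reliable-negative step is a simplified seeded two-cluster k-means (not the paper's SS-Kmeans), the function and class names (`reliable_negatives`, `RVFL`, `subagging_scores`) are hypothetical, and the data are synthetic 2-D points rather than real miRNA-disease features.

```python
import numpy as np

def reliable_negatives(X_pos, X_unl, n_iter=20):
    """Simplified stand-in for SS-Kmeans (assumption, not the paper's method):
    two-cluster k-means where known positives are pinned to the positive cluster.
    Unlabeled points assigned to the other cluster are returned as reliable negatives."""
    c_pos = X_pos.mean(axis=0)
    # Start the negative centroid at the unlabeled point farthest from the positives.
    c_neg = X_unl[np.linalg.norm(X_unl - c_pos, axis=1).argmax()]
    for _ in range(n_iter):
        d_pos = np.linalg.norm(X_unl - c_pos, axis=1)
        d_neg = np.linalg.norm(X_unl - c_neg, axis=1)
        is_neg = d_neg < d_pos
        if not is_neg.any():
            break
        c_pos = np.vstack([X_pos, X_unl[~is_neg]]).mean(axis=0)
        c_neg = X_unl[is_neg].mean(axis=0)
    return X_unl[is_neg]

class RVFL:
    """Basic RVFL network: random fixed hidden layer, direct input-output links,
    and output weights solved in closed form by ridge regression."""
    def __init__(self, n_hidden=50, reg=1e-2, rng=None):
        self.n_hidden, self.reg = n_hidden, reg
        self.rng = rng if rng is not None else np.random.default_rng()

    def _features(self, X):
        H = np.tanh(X @ self.W + self.b)                   # random hidden features
        return np.hstack([X, H, np.ones((len(X), 1))])     # direct links + bias

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        D = self._features(X)
        A = D.T @ D + self.reg * np.eye(D.shape[1])
        self.beta = np.linalg.solve(A, D.T @ y)
        return self

    def predict(self, X):
        return self._features(X) @ self.beta

def subagging_scores(X_pos, X_neg, X_test, n_models=10, seed=0):
    """Subagging: each model sees all positives plus a subsample of the
    reliable negatives drawn WITHOUT replacement; scores are averaged."""
    rng = np.random.default_rng(seed)
    m = min(len(X_pos), len(X_neg))
    y = np.concatenate([np.ones(len(X_pos)), np.zeros(m)])
    total = np.zeros(len(X_test))
    for _ in range(n_models):
        idx = rng.choice(len(X_neg), size=m, replace=False)
        X = np.vstack([X_pos, X_neg[idx]])
        total += RVFL(rng=rng).fit(X, y).predict(X_test)
    return total / n_models

# Synthetic demo data (hypothetical): positives near (2, 2), hidden negatives near (-2, -2).
data_rng = np.random.default_rng(0)
X_pos = data_rng.normal(2.0, 0.5, size=(30, 2))
X_unl = np.vstack([data_rng.normal(2.0, 0.5, size=(20, 2)),
                   data_rng.normal(-2.0, 0.5, size=(60, 2))])
X_neg = reliable_negatives(X_pos, X_unl)
scores = subagging_scores(X_pos, X_neg, np.array([[2.0, 2.0], [-2.0, -2.0]]))
```

In this sketch a higher score means a pair looks more like the known positives. Sampling negatives without replacement is what distinguishes subagging from ordinary bagging, and it gives each ensemble member a different balanced training set.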

Authors

Yao Wu
Donghua Zhu

School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China.
Xuefeng Wang

Department of Advanced Manufacturing and Robotics, College of Engineering, Peking University, Beijing 100871, China.
Shuo Zhang

Ph.D. Program in Computer Science, The City University of New York, New York, NY, United States.

Keywords

Computational Biology; Humans; Lung Neoplasms; MicroRNAs; Stomach Neoplasms; Supervised Machine Learning

External Resources

PubMed ID: 34534906
