Sequence based model using deep neural network and hybrid features for identification of 5-hydroxymethylcytosine modification.

Journal: Scientific reports
PMID:

Abstract

RNA modifications are pivotal in the development of newly synthesized structures and comprise a wide array of alterations across RNA classes. Among these, 5-hydroxymethylcytosine (5HMC) plays a crucial role in gene regulation and epigenetic changes, yet detecting it with conventional methods is cumbersome and costly. To address this, we propose Deep5HMC, a robust learning model that combines machine learning algorithms with discriminative feature extraction techniques to identify 5HMC samples accurately. Our approach integrates seven feature extraction methods and several machine learning algorithms, including Random Forest, Naive Bayes, Decision Tree, and Support Vector Machine. Under K-fold cross-validation, the model achieved an accuracy of 84.07%, surpassing previous models by 7.59%, which suggests potential for the early diagnosis of cancer and cardiovascular disease. These results indicate that Deep5HMC could inform medical assessment and treatment protocols and advance RNA modification analysis.
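
The abstract above describes a pipeline of sequence feature extraction followed by K-fold cross-validation. The sketch below illustrates those two steps in plain Python. It is not the Deep5HMC implementation: the abstract does not name its seven feature extraction methods, so k-mer composition is an assumption chosen for illustration, and the function names `kmer_frequencies` and `k_fold_indices` are hypothetical.

```python
from itertools import product


def kmer_frequencies(seq, k=2, alphabet="ACGU"):
    """Return normalized k-mer frequencies of an RNA sequence.

    Assumption: k-mer composition stands in for one of the (unnamed)
    feature extraction methods. The output has len(alphabet) ** k entries
    in a fixed lexicographic order.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def k_fold_indices(n_samples, n_folds=5):
    """Yield (train_idx, test_idx) pairs so each sample is tested exactly once.

    Folds are contiguous and unshuffled to keep the sketch deterministic;
    shuffling or stratifying by class label is a common refinement.
    """
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


if __name__ == "__main__":
    # Toy sequences, made up for illustration only.
    sequences = ["ACGUACGUAC", "CCGGAUUACG", "GGCUAACGUU", "UUACGCGAUC"]
    features = [kmer_frequencies(s, k=2) for s in sequences]
    for train_idx, test_idx in k_fold_indices(len(features), n_folds=2):
        print("train:", train_idx, "test:", test_idx)
```

In a fuller pipeline, each training split would be used to fit a classifier (for example one of the algorithms listed in the abstract), and the reported accuracy would be the average over the held-out folds.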

Authors

  • Salman Khan
  • Islam Uddin
    Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan.
  • Mukhtaj Khan
    Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan.
  • Nadeem Iqbal
    Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan.
  • Huda M Alshanbari
    Department of Mathematical Sciences, College of Science, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia.
  • Bakhtiyar Ahmad
    Higher Education Department, Kabul, Afghanistan. mbakahmad82@gmail.com.
  • Dost Muhammad Khan
    Department of Statistics, Abdul Wali Khan University, Mardan.