Time series-based hybrid ensemble learning model with multivariate multidimensional feature coding for DNA methylation prediction.

Journal: BMC Genomics
Published Date:

Abstract

BACKGROUND: DNA methylation is a form of epigenetic modification that impacts gene expression without modifying the DNA sequence, thereby exerting control over gene function and cellular development. The prediction of DNA methylation is vital for understanding and exploring gene regulatory mechanisms. Currently, machine learning algorithms are primarily used for model construction. However, several challenges remain to be addressed, including limited prediction accuracy, constrained generalization capability, and insufficient learning capacity.

Authors

  • Wu Yan
    Guangdong Key Laboratory of Modern Control Technology, Institute of Intelligent Manufacturing, Guangdong Academy of Sciences, Guangzhou, China.
  • Li Tan
    Joint Shantou International Eye Centre of Shantou University and The Chinese University of Hong Kong, Shantou, Guangdong, China.
  • Li Mengshan
    College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China. msli@gnnu.edu.cn.
  • Zhou Weihong
    School of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu, 212018, China.
  • Sheng Sheng
    School of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu, 212018, China.
  • Wang Jun
    Department of Thoracic Surgery, The Second Hospital Affiliated to Harbin Medical University, #148 Baojian Road, Harbin, 150001, China.
  • Wu Fu-An
    School of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu, 212018, China. fuan_w@just.edu.cn.