iEnhancer-ECNN: identifying enhancers and their strength using ensembles of convolutional neural networks.

Journal: BMC Genomics
Published Date:

Abstract

BACKGROUND: Enhancers are non-coding DNA fragments that are crucial in gene regulation (e.g., transcription and translation). Because enhancers show high locational variation and are freely scattered across the roughly 98% of the genome that is non-coding, identifying them is more complicated than identifying other genetic factors. To address this biological issue, several in silico studies have applied computational methods to identify and classify enhancer sequences among a myriad of DNA sequences. Although recent studies report improved performance, these learning models still have shortcomings. To overcome the limitations of existing learning models, we introduce iEnhancer-ECNN, an efficient prediction framework that uses one-hot encoding and k-mers for data transformation and ensembles of convolutional neural networks for model construction, to identify enhancers and classify their strength. The benchmark dataset from Liu et al.'s study was used to develop and evaluate the ensemble models. A comparative analysis between iEnhancer-ECNN and existing state-of-the-art methods was carried out to assess model performance fairly.
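
The data transformation named in the abstract (one-hot encoding and k-mers) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the A/C/G/T channel order, the all-zero vector for unknown bases, and the use of simple probability averaging to combine ensemble members are all assumptions made for illustration; the abstract does not specify these details.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"  # assumed channel order; the abstract does not state one


def one_hot_encode(seq):
    """Map each nucleotide to a 4-element indicator vector (order A, C, G, T).

    Characters outside A/C/G/T (e.g. N) become all zeros, which is an
    assumption of this sketch.
    """
    table = {
        nt: [1 if i == j else 0 for j in range(len(NUCLEOTIDES))]
        for i, nt in enumerate(NUCLEOTIDES)
    }
    unknown = [0] * len(NUCLEOTIDES)
    return [table.get(nt, unknown) for nt in seq.upper()]


def kmer_counts(seq, k):
    """Count overlapping k-mers (substrings of length k) in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def ensemble_average(probabilities):
    """Combine per-model probabilities by plain averaging.

    Hypothetical combination rule: averaging is one common way to build an
    ensemble, not necessarily the one used by iEnhancer-ECNN.
    """
    return sum(probabilities) / len(probabilities)


if __name__ == "__main__":
    print(one_hot_encode("ACGT"))
    print(kmer_counts("ACGAC", 2))
    print(ensemble_average([0.2, 0.4, 0.9]))
```

In a setup like this, the one-hot matrix (sequence length by 4) would typically serve as input to a convolutional network, and the k-mer counts could provide an additional descriptor for each sequence.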

Authors

  • Quang H Nguyen
    School of Information and Communication Technology, Hanoi University of Science and Technology, 1 Dai Co Viet Road, Hanoi 100000, Vietnam.
  • Thanh-Hoang Nguyen-Vo
    School of Mathematics and Statistics, Victoria University of Wellington, Kelburn Parade, Wellington 6140, New Zealand.
  • Nguyen Quoc Khanh Le
    In-Service Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei 110, Taiwan; AIBioMed Research Group, Taipei Medical University, Taipei 110, Taiwan; Translational Imaging Research Center, Taipei Medical University Hospital, Taipei 110, Taiwan. Electronic address: khanhlee@tmu.edu.tw.
  • Trang T T Do
    School of Business and Information Technology, Wellington Institute of Technology, 21 Kensington Avenue, Lower Hutt 5012, New Zealand.
  • Susanto Rahardja
    School of Marine Science and Technology, Northwestern Polytechnical University, 127 West Youyi Road, Xi'an, 710072, China. susantorahardja@ieee.org.
  • Binh P Nguyen
    School of Mathematics and Statistics, Victoria University of Wellington, Kelburn Parade, Wellington 6140, New Zealand.