DeepHPV: a deep learning model to predict human papillomavirus integration sites.

Journal: Briefings in bioinformatics
PMID:

Abstract

Integration of human papillomavirus (HPV) into the human genome is the main cause of cervical carcinogenesis. The choice of HPV integration sites depends strongly on the local genomic environment, so it should be possible to predict these sites from that environment. However, no published bioinformatic tool is available to date. We therefore developed DeepHPV, an attention-based deep learning model that predicts HPV integration sites by learning environment features automatically. In total, 3608 known HPV integration sites were used to train the model, and 584 reviewed HPV integration sites were used as the testing dataset. DeepHPV achieved an area under the receiver-operating characteristic curve (AUROC) of 0.6336 and an area under the precision-recall curve (AUPR) of 0.5670. Adding RepeatMasker peaks or TCGA Pan Cancer peaks improved the AUROC to 0.8464 and 0.8501 and the AUPR to 0.7985 and 0.8106, respectively. Next, we tested these trained models on the independent database VISDB and found that the model with TCGA Pan Cancer peaks performed better (AUROC: 0.7175, AUPR: 0.6284) than the model with RepeatMasker peaks (AUROC: 0.6102, AUPR: 0.5577). Moreover, we introduced an attention mechanism into DeepHPV and found that transcription factor binding sites, including BHLHA15, CHR, COUP-TFII, DMRTA2, E2A, HIC1, INR, NPAS, Nr5a2, RARa, SCL, Snail1, Sox10, Sox3, Sox4, Sox6, STAT6, Tbet, Tbx5, TEAD, Tgif2, ZNF189 and ZNF416, were enriched near attention-intensive sites. Together, DeepHPV is a robust and explainable deep learning model that provides new insights into HPV integration preference and mechanism. Availability: DeepHPV is available as open-source software and can be downloaded from https://github.com/JiuxingLiang/DeepHPV.git. Contact: huzheng1998@163.com, liangjiuxing@m.scnu.edu.cn, lizheyzy@163.com.
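
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how AUROC and AUPR can be computed from binary labels and predicted scores. It is not taken from the DeepHPV code base: the function names `auroc` and `average_precision` and the toy labels and scores are hypothetical, and AUPR is approximated here by average precision, which may differ slightly from the estimator the authors used.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision, a common estimator of the area under the PR curve.

    Assumption: tied scores are kept in input order (no special tie handling).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return precision_sum / hits


# Hypothetical toy data: 1 = integration site, 0 = background region.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

if __name__ == "__main__":
    print("AUROC:", round(auroc(toy_labels, toy_scores), 4))
    print("AUPR (average precision):", round(average_precision(toy_labels, toy_scores), 4))
```

An AUROC of 0.5 corresponds to random ranking. The AUPR baseline instead equals the fraction of positives in the test set, so the two numbers are not directly comparable.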

Authors

  • Rui Tian
    Department of Obstetrics and Gynecology, Precision Medicine Institute, Sun Yat-sen University, Yuexiu, Guangzhou, Guangdong, China.
  • Ping Zhou
  • Mengyuan Li
    State Key Laboratory of Integrated Services Networks, Xidian University, 710071, Xi'an, China.
  • Jinfeng Tan
    First Affiliated Hospital, Sun Yat-sen University.
  • Zifeng Cui
    Department of Obstetrics and Gynecology, Precision Medicine Institute, Sun Yat-sen University, Yuexiu, Guangzhou, Guangdong, China.
  • Wei Xu
    College of Food and Bioengineering, Henan University of Science and Technology, Luoyang, 471023, China.
  • Jingyue Wei
    Department of Obstetrics and Gynecology at the First Affiliated Hospital, Sun Yat-sen University.
  • Jingjing Zhu
    Department of Obstetrics and Gynecology of the First Affiliated Hospital, Sun Yat-sen University.
  • Zhuang Jin
    Department of Obstetrics and Gynecology, Precision Medicine Institute, Sun Yat-sen University, Yuexiu, Guangzhou, Guangdong, China.
  • Chen Cao
    Department of Neurology, The First Affiliated Hospital, Sun Yat-sen University, Yuexiu, Guangzhou, Guangdong, China.
  • Weiwen Fan
    Department of Obstetrics and Gynecology, Precision Medicine Institute, Sun Yat-sen University, Yuexiu, Guangzhou, Guangdong, China.
  • Weiling Xie
    Department of Obstetrics and Gynecology, Precision Medicine Institute, Sun Yat-sen University, Yuexiu, Guangzhou, Guangdong, China.
  • Zhaoyue Huang
    Department of Obstetrics and Gynecology, Precision Medicine Institute, Sun Yat-sen University, Yuexiu, Guangzhou, Guangdong, China.
  • Hongxian Xie
    Department of Neurology, The First Affiliated Hospital, Sun Yat-sen University, Yuexiu, Guangzhou, Guangdong, China.
  • Zeshan You
    First Affiliated Hospital, Sun Yat-sen University.
  • Gang Niu
  • Canbiao Wu
    Institute for Brain Research and Rehabilitation, South China Normal University, Guangzhou, 510631, Guangdong, China.
  • Xiaofang Guo
    Department of Medical Oncology of the Eastern Hospital, the First Affiliated Hospital, Sun Yat-Sen University, Guangdong, 510700, Guangzhou, China.
  • Xuchu Weng
    Institute for Brain Research and Rehabilitation, South China Normal University, Guangzhou, 510631, Guangdong, China.
  • Xun Tian
    Department of Obstetrics and Gynecology, Academician Expert Workstation, The Central Hospital of Wuhan, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China.
  • Fubing Yu
    Dongguan Maternal and Child Health Care Hospital.
  • Zhiying Yu
    Department of Gynecology, Shenzhen Second People's Hospital/the First Affiliated Hospital of Shenzhen University Health Science Center.
  • Jiuxing Liang
    Institute for Brain Research and Rehabilitation, South China Normal University, Guangzhou, 510631, Guangdong, China. liangjiuxing@m.scnu.edu.cn.
  • Zheng Hu
    Department of Obstetrics and Gynecology, Precision Medicine Institute, Sun Yat-sen University, Yuexiu, Guangzhou, Guangdong, China.