RNA-ligand interaction scoring via data perturbation and augmentation modeling.

Journal: Nature Computational Science
Published Date:

Abstract

Despite recent advances in RNA-targeting drug discovery, the development of data-driven deep learning models remains challenging owing to limited validated RNA-small molecule interaction data and scarce known RNA structures. In this context, we introduce RNAsmol, a sequence-based deep learning framework that incorporates data perturbation with augmentation, graph-based molecular feature representation and attention-based feature fusion modules to predict RNA-small molecule interactions. RNAsmol employs perturbation strategies to balance the bias between the true negative and unknown interaction space, thereby elucidating the intrinsic binding patterns between RNA and small molecules. The resulting model demonstrates accurate predictions of the binding between RNA and small molecules, outperforming other methods in ten-fold cross-validation, unseen evaluation and decoy evaluation. Moreover, we use case studies to visualize molecular binding profiles and the distribution of learned weights, providing interpretable insights into RNAsmol's predictions. In particular, without requiring structural input, RNAsmol can generate reliable predictions and be adapted to various drug design scenarios.
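The perturbation idea mentioned in the abstract can be illustrated with a generic sketch. This is not RNAsmol's actual procedure, which the abstract does not specify. It only assumes that unobserved RNA-ligand pairings are sampled as putative negatives to balance a positives-only dataset. The function name perturb_negatives, the ratio parameter and the toy identifiers are hypothetical.

```python
import random

def perturb_negatives(positives, rnas, ligands, ratio=1, seed=0):
    """Sample unobserved (rna, ligand) pairs as putative negatives.

    Assumption: a pair absent from the known positives is treated as a
    negative, although it may be an undiscovered binder.
    """
    rng = random.Random(seed)
    known = set(positives)
    # All pairings that are not recorded as positive interactions.
    candidates = [(r, l) for r in rnas for l in ligands if (r, l) not in known]
    # Never ask for more negatives than there are candidate pairs.
    n = min(len(candidates), ratio * len(positives))
    return rng.sample(candidates, n)

# Toy data with hypothetical identifiers.
positives = [("RNA_A", "lig1"), ("RNA_B", "lig2"), ("RNA_C", "lig3")]
rnas = sorted({r for r, _ in positives})
ligands = sorted({l for _, l in positives})
negatives = perturb_negatives(positives, rnas, ligands, ratio=1)
```

Pairs sampled this way are unlabeled rather than confirmed non-binders, which is the kind of bias between true negatives and the unknown interaction space that the abstract refers to.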

Authors

  • Hongli Ma
    MOE Key Laboratory of Bioinformatics, State Key Lab of Green Biomanufacturing, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China.
  • Letian Gao
    MOE Key Laboratory of Bioinformatics, State Key Lab of Green Biomanufacturing, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China.
  • Yunfan Jin
    MOE Key Laboratory of Bioinformatics, State Key Lab of Green Biomanufacturing, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China.
  • Jianwei Ma
    School of Information Engineering, Henan University of Science and Technology, Luoyang 471023, Henan, China.
  • Yilan Bai
    MOE Key Laboratory of Bioinformatics, State Key Lab of Green Biomanufacturing, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China.
  • Xiaofan Liu
    State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, Hubei Province, China; email: daohongjiang@mail.hzau.edu.cn, jiasencheng@mail.hzau.edu.cn.
  • Pengfei Bao
    MOE Key Laboratory of Bioinformatics, State Key Lab of Green Biomanufacturing, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China.
  • Ke Liu
    State Key Laboratory of Stress Cell Biology, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, P.R. China.
  • Zhenjiang Zech Xu
    State Key Laboratory of Food Science and Resources, Nanchang University, Nanchang, China. zhenjiang.xu@gmail.com.
  • Zhi John Lu
    MOE Key Laboratory of Bioinformatics, State Key Lab of Green Biomanufacturing, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing, 100084, China. zhilu@tsinghua.edu.cn.
