ChatGPT Combining Machine Learning for the Prediction of Nanozyme Catalytic Types and Activities.

Journal: Journal of chemical information and modeling
Published Date:

Abstract

The design of nanozymes with superior catalytic activities is a prerequisite for broadening their biomedical applications. Previous studies have invested significant effort in theoretical calculations and experimental trials to enhance the catalytic activity of nanozymes. Machine learning (ML) offers a forward-looking aid for predicting nanozyme catalytic activity. However, it requires a significant amount of human effort for data collection, and its prediction accuracy still needs to be improved. Herein, we demonstrate that ChatGPT can collaborate with humans to collect data efficiently. We establish four qualitative models (random forest (RF), decision tree (DT), AdaBoost random forest (AdaBoost-RF), and AdaBoost decision tree (AdaBoost-DT)) for predicting nanozyme catalytic types, such as peroxidase, oxidase, catalase, superoxide dismutase, and glutathione peroxidase. Furthermore, we use five quantitative models (RF, DT, support vector regression (SVR), gradient boosting regression (GBR), and a fully connected deep neural network (DNN)) to predict nanozyme catalytic activities. We find that the GBR model demonstrates superior prediction performance for nanozyme catalytic activities (R² = 0.6476 for Km and R² = 0.95 for Kcat). Moreover, an open-access web resource, AI-ZYMES, with a ChatGPT-based nanozyme copilot is developed for predicting nanozyme catalytic types and activities and for guiding the synthesis of nanozymes. The accuracy of the nanozyme copilot's responses exceeds 90% through retrieval-augmented generation. This study provides a new potential application for ChatGPT in the field of nanozymes.
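
As a reading aid, and assuming the scores quoted for Km and Kcat are coefficients of determination (R², the usual metric for regression models), the following minimal sketch shows how such a score is computed from measured and predicted values. The function name `r_squared` and the sample numbers are hypothetical illustrations; they are not taken from the paper or from AI-ZYMES.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. predicted values (illustrative only, not from the paper).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(measured, predicted)
```

A score of 1 means the predictions match the measurements exactly, while a score of 0 means the model does no better than always predicting the mean of the measured values. On that reading, the reported Kcat score indicates a much closer fit than the Km score.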

Authors

  • Liping Sun
    School of Medical Informatics Engineering, Anhui University of Chinese Medicine, Hefei, Anhui 230012, China.
  • Jili Hu
    School of Medical Informatics Engineering, Anhui University of Chinese Medicine, Hefei, Anhui 230012, China.
  • Yinfeng Yang
    School of Medical Informatics Engineering, Anhui University of Chinese Medicine, Hefei, Anhui 230012, China.
  • Yongkang Wang
    College of Informatics, Huazhong Agricultural University, Wuhan 430070, China.
  • Zijian Wang
    School of Medical Informatics Engineering, Anhui University of Chinese Medicine, Hefei, Anhui 230012, China.
  • Yong Gao
    Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi, China.
  • Yiqi Nie
    School of Medical Information Engineering, Anhui University of Traditional Chinese Medicine, Hefei, Anhui, China.
  • Can Liu
    School of Medical Informatics Engineering, Anhui University of Chinese Medicine, Hefei, Anhui 230012, China.
  • Hongxing Kan
    School of Medical Information Engineering, Anhui University of Traditional Chinese Medicine, Hefei, Anhui, China.