Generative artificial intelligence performs rudimentary structural biology modeling.

Journal: Scientific Reports
PMID:

Abstract

Natural language-based generative artificial intelligence (AI) has become increasingly prevalent in scientific research. Intriguingly, capabilities of generative pre-trained transformer (GPT) language models beyond the scope of natural language tasks have recently been identified. Here we explored how GPT-4 might be able to perform rudimentary structural biology modeling. We prompted GPT-4 to model 3D structures for the 20 standard amino acids and an α-helical polypeptide chain, with the latter incorporating Wolfram mathematical computation. We also used GPT-4 to perform structural interaction analysis between the antiviral drug nirmatrelvir and its target, the SARS-CoV-2 main protease. Geometric parameters of the generated structures typically came close to experimental reference values. However, modeling was sporadically error-prone, and molecular complexity was not well tolerated. Interaction analysis further revealed the ability of GPT-4 to identify specific amino acid residues involved in ligand binding, along with the corresponding bond distances. Despite these limitations, we show that natural language generative AI can already perform basic structural biology modeling and interaction analysis with atomic-scale accuracy.
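
To make the idea of checking geometric parameters against reference values concrete, here is a minimal Python sketch. It is not the authors' method (the study prompted GPT-4 and used Wolfram computation); it only illustrates the kind of geometry involved. It places Cα atoms on an ideal α-helix using textbook values (about 2.3 Å radius, 1.5 Å rise per residue, 100° rotation per residue) and measures consecutive Cα-Cα distances. The function and constant names are hypothetical, chosen for illustration.

```python
import math

# Textbook parameters for an ideal alpha-helix (assumed values, not taken from the paper).
RADIUS_A = 2.3      # radial distance of C-alpha atoms from the helix axis, in angstroms
RISE_A = 1.5        # translation along the helix axis per residue, in angstroms
TURN_DEG = 100.0    # rotation about the axis per residue (3.6 residues per turn)


def helix_ca_coords(n_residues):
    """Return (x, y, z) C-alpha coordinates for an ideal helix along the z axis."""
    coords = []
    for i in range(n_residues):
        theta = math.radians(TURN_DEG * i)
        coords.append((RADIUS_A * math.cos(theta),
                       RADIUS_A * math.sin(theta),
                       RISE_A * i))
    return coords


def distance(a, b):
    """Euclidean distance between two 3D points, in angstroms."""
    return math.dist(a, b)


coords = helix_ca_coords(10)

# Consecutive C-alpha atoms should be roughly 3.8 angstroms apart.
ca_ca = [distance(coords[i], coords[i + 1]) for i in range(len(coords) - 1)]

# Helix pitch: rise per residue times residues per turn (360 / 100 = 3.6).
pitch = RISE_A * 360.0 / TURN_DEG

for i, d in enumerate(ca_ca, start=1):
    print(f"CA{i}-CA{i + 1}: {d:.2f} A")
print(f"pitch: {pitch:.1f} A")
```

The same distance function could be applied to any pair of atomic coordinates, for example a ligand atom and a protein residue atom, to check a reported interaction distance.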

Authors

  • Alexander M Ille
    School of Graduate Studies, Rutgers University, Newark, NJ, USA. Electronic address: mai86@gsbs.rutgers.edu.
  • Christopher Markosian
    Department of Neurological Surgery, Rutgers New Jersey Medical School, Newark, New Jersey.
  • Stephen K Burley
    Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California San Diego, San Diego, CA, United States.
  • Michael B Mathews
    School of Graduate Studies, Rutgers University, Newark, NJ, USA; Department of Medicine, Rutgers New Jersey Medical School, Newark, NJ, USA. Electronic address: mathews@njms.rutgers.edu.
  • Renata Pasqualini
    From the Research Collaboratory for Structural Bioinformatics Protein Data Bank, the Institute for Quantitative Biomedicine, and the Department of Chemistry and Chemical Biology, Rutgers, the State University of New Jersey (S.K.B.), and the Rutgers Cancer Institute of New Jersey, New Brunswick (S.K.B.) and Newark (W.A., R.P.); and the Division of Hematology-Oncology, Department of Medicine (W.A.), and the Division of Cancer Biology, Department of Radiation Oncology (R.P.), Rutgers New Jersey Medical School, Newark; and the Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, San Diego (S.K.B.).
  • Wadih Arap
    From the Research Collaboratory for Structural Bioinformatics Protein Data Bank, the Institute for Quantitative Biomedicine, and the Department of Chemistry and Chemical Biology, Rutgers, the State University of New Jersey (S.K.B.), and the Rutgers Cancer Institute of New Jersey, New Brunswick (S.K.B.) and Newark (W.A., R.P.); and the Division of Hematology-Oncology, Department of Medicine (W.A.), and the Division of Cancer Biology, Department of Radiation Oncology (R.P.), Rutgers New Jersey Medical School, Newark; and the Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, San Diego (S.K.B.).