Using machine learning to predict the effects and consequences of mutations in proteins.

Journal: Current Opinion in Structural Biology

Published Date: Jan 3, 2023

Abstract

Machine and deep learning approaches can leverage the increasingly available massive datasets of protein sequences, structures, and mutational effects to predict variants with improved fitness. Many different approaches are being developed, but systematic benchmarking studies indicate that, although the specifics of the machine learning algorithms matter, the more important constraint is the availability and quality of the data used during training. In cases where little experimental data are available, unsupervised and self-supervised pre-training with generic protein datasets can still perform well after subsequent refinement via hybrid or transfer learning approaches. Overall, recent progress in this field has been staggering, and machine learning approaches will likely play a major role in future breakthroughs in protein biochemistry and engineering.

Authors

Daniel J Diaz

Center for Systems and Synthetic Biology, Department of Chemistry, and Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX, USA.

Anastasiya V Kulikova

Center for Systems and Synthetic Biology and Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA.

Andrew D Ellington

Center for Systems and Synthetic Biology and Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX, USA.

Claus O Wilke

Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA. wilke@austin.utexas.edu.

Keywords

Algorithms; Amino Acid Sequence; Machine Learning; Mutation; Neural Networks, Computer

External Resources

PubMed ID: 36603229
