Accelerating protein engineering with fitness landscape modeling and reinforcement learning

Journal: bioRxiv

Published Date: Jan 1, 2025

Abstract

Protein engineering holds significant promise for designing proteins with customized functions, yet the vast space of possible mutations, set against limited laboratory capacity, constrains the discovery of optimal sequences. To address this, we present the µProtein framework, which accelerates protein engineering by combining µFormer, a deep learning model for accurate mutational effect prediction, with µSearch, a reinforcement learning algorithm designed to efficiently navigate the protein fitness landscape using µFormer as an oracle. By modeling epistatic interactions and applying a multi-step search strategy, µProtein leverages single-mutation data to predict optimal sequences carrying complex, multi-amino-acid mutations. In addition to achieving state-of-the-art performance on benchmark datasets, µProtein, trained solely on single-mutation data, identified high-gain-of-function multi-point mutants of the enzyme β-lactamase whose activity, measured in wet-lab experiments, surpassed the highest known level. These results demonstrate µProtein’s capability to discover impactful mutations across the vast protein sequence space, offering a robust, efficient approach for protein optimization.
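The oracle-guided, multi-step search described above can be illustrated with a small toy sketch. This is not the authors' implementation: the oracle below is a hypothetical random additive-plus-pairwise (epistatic) scoring function standing in for a learned predictor such as µFormer, and a simple greedy search stands in for the reinforcement learning policy of µSearch. All names (`make_toy_oracle`, `greedy_search`, `max_mutations`) are assumptions made for illustration only.

```python
import itertools
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def make_toy_oracle(wild_type, seed=0):
    """Build a hypothetical fitness oracle (NOT µFormer).

    Score = sum of random single-mutation effects
          + sum of random pairwise (epistatic) terms
    over all positions that differ from the wild type.
    """
    rng = random.Random(seed)
    n = len(wild_type)
    single = {(i, a): rng.gauss(0, 1) for i in range(n) for a in AMINO_ACIDS}
    pair = {(i, j): rng.gauss(0, 0.5)
            for i, j in itertools.combinations(range(n), 2)}

    def oracle(seq):
        mutated = [i for i in range(n) if seq[i] != wild_type[i]]
        score = sum(single[(i, seq[i])] for i in mutated)
        score += sum(pair[(i, j)]
                     for i, j in itertools.combinations(mutated, 2))
        return score

    return oracle


def greedy_search(wild_type, oracle, max_mutations=3):
    """Greedy multi-step search (a stand-in for an RL policy).

    At each step, try every single substitution of the current best
    sequence and keep the one the oracle scores highest. Stop early
    if no substitution improves the score.
    """
    best_seq, best_score = wild_type, oracle(wild_type)
    for _ in range(max_mutations):
        candidates = [
            best_seq[:i] + a + best_seq[i + 1:]
            for i in range(len(best_seq))
            for a in AMINO_ACIDS
            if a != best_seq[i]
        ]
        top = max(candidates, key=oracle)
        top_score = oracle(top)
        if top_score <= best_score:
            break
        best_seq, best_score = top, top_score
    return best_seq, best_score


if __name__ == "__main__":
    wt = "MKTAYIAK"  # arbitrary toy sequence, not a real protein
    oracle = make_toy_oracle(wt)
    seq, score = greedy_search(wt, oracle, max_mutations=3)
    print(seq, round(score, 3))
```

Because each step re-scores candidates in the context of the mutations already chosen, the pairwise terms influence which substitution is picked next; this is the sense in which a multi-step search can exploit epistasis that a one-shot ranking of single mutants would miss.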

Authors

Haoran Sun; Liang He; Pan Deng; Guoqing Liu; Zhiyu Zhao; Yuliang Jiang; Chuan Cao; Fusong Ju; Lijun Wu; Haiguang Liu; Tao Qin; Tie-Yan Liu

