Prediction of Adeno-Associated Virus Fitness with a Protein Language-Based Machine Learning Model.

Journal: Human Gene Therapy

PMID: 40241334

Abstract

Adeno-associated virus (AAV)-based therapeutics have the potential to transform the lives of patients by delivering one-time treatments for a variety of diseases. However, a critical challenge to their widespread adoption and distribution is the high cost of goods. Reducing manufacturing costs by developing AAV capsids with improved yield, or fitness, is key to making gene therapies more affordable. AAV fitness is largely determined by the amino acid sequence of the capsid; however, engineered AAVs are rarely optimized for manufacturability. Here, we report a state-of-the-art machine learning (ML) model that predicts the fitness of AAV2 capsid mutants based on the amino acid sequence of the capsid monomer. By combining a protein language model (PLM) and classical ML techniques, our model achieved high prediction accuracy (Pearson correlation = 0.818) for capsid fitness. Importantly, tests on completely independent datasets showed the robustness and generalizability of our model, even for multimutant AAV capsids. Our accurate ML-based model can be used as a surrogate for laborious experiments, thus saving time and resources, and can be deployed to increase the fitness of clinical AAV capsids to make gene therapies economically viable for patients.
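
The accuracy figure in the abstract is a Pearson correlation between predicted and measured capsid fitness. As a minimal sketch of how such a score can be computed, the following uses only the Python standard library. The function name `pearson_r` and the toy numbers are illustrative assumptions; they are not the authors' code or data.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(predicted) != len(measured):
        raise ValueError("inputs must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two paired values")

    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n

    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)

    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)


# Toy values for illustration only; they are NOT data from the study.
toy_predicted = [0.1, 0.4, 0.35, 0.8]
toy_measured = [0.2, 0.5, 0.3, 0.9]
r = pearson_r(toy_predicted, toy_measured)
```

A value near 1 means the predictions rise and fall linearly with the measurements. The score is insensitive to the scale and offset of the predictions, so it measures agreement in trend rather than absolute error.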

Authors

Jason Wu

Genomic Medicine Unit, Sanofi, Waltham, Massachusetts, USA.

Yu Qiu

The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China.

Eugenia Lyashenko

Genomic Medicine Unit, Sanofi, Waltham, Massachusetts, USA.

Tess Torregrosa

Genomic Medicine Unit, Sanofi, Waltham, Massachusetts, USA.

Edith L Pfister

Genomic Medicine Unit, Sanofi, Waltham, Massachusetts, USA.

Michael J Ryan

Orthopaedic Machine Learning Laboratory, Orthopaedic Intelligence LLC, Cleveland Heights, OH.

Christian Mueller

Department of Cardiology and Cardiovascular Research Institute Basel (CRIB), University Heart Center Basel, University Hospital Basel, University of Basel, Switzerland.

Sourav R Choudhury

Genomic Medicine Unit, Sanofi, Waltham, Massachusetts, USA.

Keywords

Capsid; Capsid Proteins; Dependovirus; Genetic Therapy; Genetic Vectors; Humans; Machine Learning; Mutation

