BetaAlign: a deep learning approach for multiple sequence alignment.
Journal:
Bioinformatics (Oxford, England)
PMID:
39775454
Abstract
MOTIVATION: Multiple sequence alignments (MSAs) are extensively used in biology, from phylogenetic reconstruction to structure and function prediction. Here, we suggest an out-of-the-box approach for the inference of MSAs, which relies on algorithms developed for processing natural languages. We show that our artificial intelligence (AI)-based methodology can be trained to align sequences by processing alignments that are generated via simulations, and thus different aligners can be easily generated for datasets with specific evolutionary dynamics attributes. We expect that natural language processing (NLP) solutions will replace or augment classic solutions for computing alignments, and more generally, challenging inference tasks in phylogenomics.