SetBERT: the deep learning platform for contextualized embeddings and explainable predictions from high-throughput sequencing.
Journal:
Bioinformatics (Oxford, England)
Published Date:
Jun 25, 2025
Abstract
MOTIVATION: High-throughput sequencing is a modern sequencing technology used to profile microbiomes by sequencing thousands of short genomic fragments from the microorganisms within a given sample. This technology presents a unique opportunity for artificial intelligence to comprehend the underlying functional relationships of microbial communities. However, due to the unstructured nature of high-throughput sequencing data, nearly all computational models are limited to processing DNA sequences individually. This limitation causes them to miss out on key interactions between microorganisms, significantly hindering our understanding of how these interactions influence the microbial communities as a whole. Furthermore, most computational methods rely on post-processing of samples which could inadvertently introduce unintentional protocol-specific bias.
Authors
Keywords
No keywords available for this article.