Annotating genetic variants, especially non-coding variants, to identify pathogenic variants remains a challenge. Combined annotation-dependent depletion (CADD) is an algorithm designed to annotate both coding and non-c...
Protein function prediction (PFP) is an automated function prediction method that predicts Gene Ontology (GO) annotations for a protein sequence using distantly related sequences and contextual associations of GO terms. Extended similarit...
SUMMARY: Currently available and frequently used tools for annotating antibiotic resistance genes (ARGs) in genomes and metagenomes provide results using inconsistent nomenclature. This makes the comparison of different ARG annotation outputs challen...
MOTIVATION: Understanding the protein sequence-function relationship is essential for advancing protein biology and engineering. However, <1% of known protein sequences have human-verified functions. While deep-learning methods have demonstrated prom...
MOTIVATION: Leveraging deep learning for the representation learning of Gene Ontology (GO) and Gene Ontology Annotation (GOA) holds significant promise for enhancing downstream biological tasks such as protein-protein interaction prediction. Prior ap...
SUMMARY: In single-cell transcriptomics, inconsistent cell type annotations due to varied naming conventions and hierarchical granularity impede data integration, machine learning applications, and meaningful evaluations. To address this challenge, w...
MOTIVATION: Progress in sequencing technology has led to the determination of large numbers of protein sequences, and large enzyme databases are now available. Although many computational tools for enzyme annotation have been developed, sequence information i...
MOTIVATION: As the biological roles and disease implications of non-coding RNAs continue to emerge, the need to thoroughly characterize previously unexplored non-coding RNAs becomes increasingly urgent. These molecules hold potential as biomarkers an...
The organization of subcellular components in a cell is critical for its function and for studying cellular processes, protein-protein interactions, and other systems biology mechanisms, as well as for network analysis and identifying potential drug targets. Determining p...
Spatial transcriptomics technology has revolutionized our understanding of cellular systems by capturing RNA transcript levels in their original spatial context. Single-cell spatial transcriptomics (scST) offers single-cell resolution expression leve...