The discovery of molecular relationships from high-dimensional data is a major open problem in bioinformatics. Machine learning and feature attribution models have shown great promise in this context but lack causal interpretation. Here, we show that...
Metagenomics, particularly genome-resolved metagenomics, has significantly deepened our understanding of microbes, illuminating their taxonomic and functional diversity and roles in ecology, physiology, and evolution. However, eukaryotic populations...
With the ongoing evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and its increasing adaptation to humans, several variants of concern (VOCs) and variants of interest (VOIs) have been identified since late 2020. These include...
The Journal of molecular diagnostics : JMD
Feb 13, 2025
The widespread adoption of next-generation sequencing technology in molecular pathology has enabled us to interrogate the genome as never before. The huge quantities of data generated by sequencing, the enormous complexity of human and microbial gene...
Bacteriophages (phages) are the most predominant and genetically diverse biological entities on Earth. Phages are viruses that infect bacteria and encode numerous proteins with potential biotechnological application. However, most phage-encoded prote...
The rapid advancement of single-cell technologies has created an urgent need for effective methods to integrate and harmonize single-cell data. Technical and biological variations across studies complicate data integration, while conventional tools o...
Accurately labeling large datasets is important for biomedical machine learning yet challenging, while modern data augmentation methods may introduce noise into the training data, which can degrade machine learning model performance. Existing approac...
Forecasting the occurrence and absence of novel disease outbreaks is essential for disease management, yet existing methods are often context-specific and require a long preparation time, and non-outbreak prediction remains understudied. To address this...
Machine learning (ML) is changing the world of computational protein design, with data-driven methods surpassing biophysics-based methods in experimental success. However, they are most often reported as case studies, lack integration and standardiz...
Gene microarray technology provides an efficient way to diagnose cancer. However, microarray gene expression data face the challenges of high dimensionality, small sample size, and multi-class imbalance. The coupling of these challenges leads to inaccurate res...