IEEE/ACM transactions on computational biology and bioinformatics
Apr 6, 2021
Single nucleotide polymorphisms (SNPs) are one type of genetic variations and each SNP represents a difference in a single DNA building block, namely a nucleotide. Previous research demonstrated that SNPs can be used to identify the correct source po...
Balancing selection is an important adaptive mechanism underpinning a wide range of phenotypes. Despite its relevance, the detection of recent balancing selection from genomic data is challenging as its signatures are qualitatively similar to those l...
BACKGROUND: Supervised learning from high-throughput sequencing data presents many challenges. For one, the curse of dimensionality often leads to overfitting as well as issues with scalability. This can bring about inaccurate models or those that re...
Neural networks : the official journal of the International Neural Network Society
Mar 26, 2021
A statistical framework for non-negative matrix factorization based on generalized dual Kullback-Leibler divergence, which includes members of the exponential family of models, is proposed. A family of algorithms is developed using this framework, in...
BACKGROUND: Survival and drug response are two highly emphasized clinical outcomes in cancer research that directs the prognosis of a cancer patient. Here, we have proposed a late multi omics integrative framework that robustly quantifies survival an...
Gradient Forests (GF) is a machine learning algorithm that is gaining in popularity for studying the environmental drivers of genomic variation and for incorporating genomic information into climate change impact assessments. Here we (i) provide the ...
Tolerance to stress conditions is vital for organismal survival, including bacteria under specific environmental conditions, antibiotics, and other perturbations. Some studies have described common modulation and shared genes during stress response t...
Cancer is a complex disease that deregulates cellular functions at various molecular levels (e.g., DNA, RNA, and proteins). Integrated multi-omics analysis of data from these levels is necessary to understand the aberrant cellular functions accountab...
Elucidating functionality in non-coding regions is a key challenge in human genomics. It has been shown that intolerance to variation of coding and proximal non-coding sequence is a strong predictor of human disease relevance. Here, we integrate into...
Controlling quality of next-generation sequencing (NGS) data files is a necessary but complex task. To address this problem, we statistically characterize common NGS quality features and develop a novel quality control procedure involving tree-based ...