BACKGROUND: Multi-layer perceptron (MLP) and radial basis function neural networks (RBFNN) have been shown to be effective in genome-enabled prediction. Here, we evaluated and compared the classification performance of an MLP classifier versus that o...
MOTIVATION: Supervised machine learning is widely applied to transcriptomic data to predict disease diagnosis, prognosis or survival. Robust and interpretable classifiers with high accuracy are usually favored for their clinical and translational pot...
IEEE/ACM transactions on computational biology and bioinformatics
Feb 12, 2016
Deciphering gene-disease associations is an important goal in biomedical research. In this paper, we use a novel relevance measure, called HeteSim, to prioritize candidate disease genes. Two methods based on heterogeneous networks constructed usin...
Journal of computational biology : a journal of computational molecular cell biology
Feb 1, 2016
Constructing coexpression and association networks with omics data is crucial for studying gene-gene interactions and underlying biological mechanisms. In recent years, learning the structure of a Gaussian graphical model from high-dimensional data u...
Lung cancer is responsible for a large number of cancer-related deaths worldwide. The recommended standard for screening and early detection of lung cancer is low-dose computed tomography. However, many patients diagnosed...
We introduce a new representation and feature extraction method for biological sequences. Named bio-vectors (BioVec) to refer to biological sequences in general with protein-vectors (ProtVec) for proteins (amino-acid sequences) and gene-vectors (Gene...
IEEE/ACM transactions on computational biology and bioinformatics
Oct 26, 2015
The Encyclopedia of DNA Elements (ENCODE) is a huge and still expanding public repository of more than 4,000 experiments and 25,000 data files, assembled by a large international consortium since 2007; unknown biological knowledge can be extracted fr...
Translation is an essential genetic process for understanding the mechanism of gene expression. Due to the large number of protein sequences generated in the post-genomic era, conventional methods are unable to identify Translation Initiation Site (T...
BACKGROUND: Gene ontology (GO) enrichment is commonly used for inferring biological meaning from systems biology experiments. However, determining differential GO and pathway enrichment between DNA-binding experiments or using the GO structure to cla...
BACKGROUND: Recent biochemical advances have led to inexpensive, time-efficient production of massive volumes of raw genomic data. Traditional machine learning approaches to genome annotation typically rely on large amounts of labeled data. The proce...