BACKGROUND: With the rapid development of high-throughput sequencing technology, proteomics research has become a popular field in the post-genomics era. It is necessary to identify all the native-encoding protein sequences for further function and p...
BACKGROUND: This work presents a machine learning strategy to increase sensitivity in tandem mass spectrometry (MS/MS) data analysis for peptide/protein identification. MS/MS yields thousands of spectra in a single run, which are then interpreted by s...
Canadian journal of physiology and pharmacology
Nov 18, 2016
Diallyl trisulfide (DATS), a major garlic derivative, inhibits cell proliferation and triggers apoptosis in a variety of cancer cell lines. However, the effects of DATS on hepatic stellate cells (HSCs) remain unknown. The aim of this study was to ana...
A central problem in mass spectrometry analysis involves identifying, for each observed tandem mass spectrum, the corresponding generating peptide. We present a dynamic Bayesian network (DBN) toolkit that addresses this problem by using a machine lea...
Medical diagnostics is often a multi-attribute problem, necessitating sophisticated tools for analyzing high-dimensional biomedical data. Mining this data often results in two crucial bottlenecks: 1) high dimensionality of features used to represent ...
This article focuses on improving machine learning models that predict protein expression levels from their codon encoding. Support vector regression (SVR) and partial least squares (PLS) were used to create the models. SVR yields ...
BACKGROUND: The conjugation of ubiquitin to a substrate protein (protein ubiquitylation), which involves a sequential process of E1 activation, E2 conjugation and E3 ligation, is crucial to the regulation of protein function and activity in eukaryotes....
Artificial neural networks had their first heyday in molecular informatics and drug discovery approximately two decades ago. Currently, we are witnessing renewed interest in adapting advanced neural network architectures for pharmaceutical research b...
We introduce a new representation and feature extraction method for biological sequences. Named bio-vectors (BioVec) to refer to biological sequences in general with protein-vectors (ProtVec) for proteins (amino-acid sequences) and gene-vectors (Gene...
IEEE/ACM transactions on computational biology and bioinformatics
Sep 18, 2015
SEQUEST is a database-searching engine that calculates the correlation score between an observed spectrum and a theoretical spectrum deduced from protein sequences stored in a flat text file, even though it is not a relational and object-oriented reposi...