We are now entering a new era in protein sequence and structure annotation, with hundreds of millions of predicted protein structures made available through the AlphaFold database. These models cover nearly all proteins that are known, including thos...
Statistical applications in genetics and molecular biology
Sep 4, 2023
Proteins are the building blocks of all living things. Protein function must be ascertained if the molecular mechanism of life is to be understood. While CNN is good at capturing short-term relationships, GRU and LSTM can capture long-term dependenci...
Journal of chemical information and modeling
Aug 8, 2023
The prediction of peptide amyloidogenesis is a challenging problem in the field of protein folding. Large language models, such as the ProtBERT model, have recently emerged as powerful tools in analyzing protein sequences for applications, such as pr...
Journal of chemical theory and computation
Aug 8, 2023
Explainable and interpretable unsupervised machine learning helps one to understand the underlying structure of data. We introduce an ensemble analysis of machine learning models to consolidate their interpretation. Its application shows that restric...
BACKGROUND: Genetic variation in the human genome is a major determinant of individual disease risk, but the vast majority of missense variants have unknown etiological effects. Here, we present a robust learning framework for leveraging saturation m...
Journal of chemical information and modeling
Jul 27, 2023
Gaussian processes (GPs) are Bayesian models that provide several advantages for regression tasks in machine learning, such as reliable quantification of uncertainty and improved interpretability. Their adoption has been precluded by their excessive comp...
Phosphorylation is one of the most important post-translational modifications and plays a pivotal role in various cellular processes. Although there exist several computational tools to predict phosphorylation sites, existing tools have not yet harne...
Predicting peptide detectability is useful in a variety of mass spectrometry (MS)-based proteomics applications, particularly targeted proteomics. However, most machine learning-based computational methods have relied solely on information from the p...
The turnover number kcat, a measure of enzyme efficiency, is central to understanding cellular physiology and resource allocation. As experimental kcat estimates are unavailable for the vast majority of enzymatic reactions, the development of accurate comp...
Journal of chemical information and modeling
Jul 3, 2023
Determining the catalytic sites of enzymes greatly helps in understanding the relationship between protein sequence, structure, and function, which provides the basis and targets for designing, modifying, and enhancing enzyme activity. The unique l...