Proceedings of the National Academy of Sciences of the United States of America
34187888
Recent progress in DNA synthesis and sequencing technology has enabled systematic studies of protein function at a massive scale. We explore a deep mutational scanning study that measured the transcriptional repression function of 43,669 variants of ...
To improve synthetic media for protein expression in Escherichia coli, this study applied a strategy combining deep neural networks (DNNs) and Bayesian optimization. To obtain training data for a deep learning algorithm, E. coli harboring a plas...
To reach their final destinations, outer membrane proteins (OMPs) of gram-negative bacteria undertake an eventful journey beginning in the cytosol. Multiple molecular machines, chaperones, proteases, and other enzymes facilitate the translocation and...
Transcriptomic data is accumulating rapidly; thus, scalable methods for extracting knowledge from this data are critical. Here, we assembled a top-down expression and regulation knowledge base for Escherichia coli. The expression component is a 1035-...
Owing to the nondeterministic and nonlinear nature of gene expression, the steady-state intracellular protein abundance of a clonal population forms a distribution. The characteristics of this distribution, including expression strength and noise, ar...
Proteome analysis currently relies heavily on tandem mass spectrometry (MS/MS), which does not fully utilize MS1 features, as many precursors remain unselected for MS/MS fragmentation, especially for low-abundance samples and wide abundan...
Protein solubility is a critical parameter that determines the stability, activity, and functionality of proteins, with broad and far-reaching implications in biotechnology and biochemistry. Accurate prediction and control of protein solubility are e...
The first step in bottom-up proteomics is the assignment of measured fragmentation mass spectra to peptide sequences, also known as peptide spectrum matches. In recent years novel algorithms have pushed the assignment to new heights; unfortunately, d...
Journal of Chemical Theory and Computation
40211504
This work introduces LEGOLAS, a fully open source TorchANI-based neural network model designed to predict NMR chemical shifts for protein backbone atoms (N, Cα, Cβ, C', HN, Hα). LEGOLAS has been designed to be fast without loss of accuracy, as our mo...
The DNA binding of most Escherichia coli Transcription Factors (TFs) has not been comprehensively mapped, and few have models that can quantitatively predict binding affinity. We report the global mapping of in vivo DNA binding for 139 E. coli TFs us...