Long non-coding RNAs (lncRNAs) are closely associated with the regulation of gene expression, and their promoters are crucial to a comprehensive understanding of lncRNA regulatory mechanisms, functions, and roles in disease. Due to limitations...
Journal of chemical information and modeling
40202493
The advent of powerful machine learning algorithms, together with the availability of large volumes of pharmacological data, has given new fuel to QSAR, opening unprecedented options for deriving highly predictive models for assisting the rational desi...
Sap flow, a critical process in plant water use and ecosystem water cycles, is often measured using thermal dissipation probes (TDP) due to their ease of installation and continuous data collection. However, sap flow data frequently include noise, ou...
High-throughput multi-omic molecular profiling allows the probing of biological systems at unprecedented resolution. However, integrating and interpreting high-dimensional, sparse, and noisy multimodal datasets remains challenging. Deriving new biolo...
Gene expression is the basis for cells to achieve various functions, while DNA methylation constitutes a critical epigenetic mechanism governing gene expression regulation. Here we propose DeepMethyGene, an adaptive recursive convolutional neural net...
Information leakage is an increasingly important topic in machine learning research for biomedical applications. When information leakage happens during training, the model risks memorizing the training data instead of learning generalizable prope...
BACKGROUND: Clustering scRNA-seq data plays a vital role in scRNA-seq data analysis and downstream analyses. Many computational methods have been proposed and have achieved remarkable results. However, these methods have several limitations. First...
BACKGROUND: Chromosomes exhibit a variety of high-dimensional organizational features, including chromatin loops, which are fundamental structures in the three-dimensional (3D) organization of the genome. Chromatin loops are visible speckled patter...
Identifying DNA-binding proteins and their binding residues is critical for understanding diverse biological processes, but conventional experimental approaches are slow and costly. Existing machine learning methods, while faster, often lack accuracy...
Degeneracy in the genetic code allows many possible DNA sequences to encode the same protein. Optimizing codon usage within a sequence to meet organism-specific preferences faces combinatorial explosion. Nevertheless, natural sequences optimized thro...
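The combinatorial explosion mentioned above can be made concrete with a small counting sketch. It is not taken from the abstract: the function name `count_synonymous_sequences` and the table `CODON_DEGENERACY` are illustrative, and the sketch assumes the standard genetic code with sense codons only (no stop codon counted).

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code.
# The values sum to 61, the number of sense codons. (Illustrative table,
# assumed here; organism-specific codes can differ.)
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_synonymous_sequences(protein: str) -> int:
    """Count the DNA coding sequences that translate to `protein`.

    Each residue can be encoded independently, so the total is the
    product of the per-residue codon counts. Raises KeyError for any
    character that is not one of the 20 standard amino acids.
    """
    return prod(CODON_DEGENERACY[aa] for aa in protein.upper())


if __name__ == "__main__":
    # M has 1 codon, K has 2, L has 6, so MKL has 1 * 2 * 6 encodings.
    print(count_synonymous_sequences("MKL"))
```

Because the count is a product over residues, it grows multiplicatively with protein length, which is why exhaustive enumeration of codon choices quickly becomes impractical.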