In post-genome-wide association study era, interpretation of noncoding variants remains a significant challenge due to their complexity and the limited understanding of their functions. Here, we developed MIRACN, a novel residual convolutional neural...
Cancer is a major public health problem while liver cancer is the main cause of global cancer-related deaths. The previous study demonstrates that the 5-year survival rate for advanced liver cancer is only 30%. Few of the first-line targeted drugs in...
The prediction of enzyme kinetic parameters is crucial for screening enzymes with high catalytic efficiency and desired characteristics to catalyze natural or non-natural reactions. Data-driven machine learning models have been explored to reduce exp...
The recent progress of deep generative models in modeling complex real-world data distributions has enabled the generation of novel compounds with potential therapeutic applications for various diseases. However, most studies fail to optimize the pro...
Identification of intrinsically disordered regions (IDRs) in proteins is essential for understanding fundamental cellular processes. The IDRs can be divided into long disordered regions (LDRs) and short disordered regions (SDRs) according to their le...
Integrating and analyzing multiple omics datasets, such as genomics, environmental influences, and imaging endophenotypes, has yielded an abundance of candidate biomarkers. However, translating such findings into beneficial clinical knowledge for dis...
Accurate prediction of pathogenic variants in human disease-associated genes would have a profound effect on clinical decision-making; however, it remains a significant challenge due to the overwhelming number of these variants. We propose graph neur...
Non-coding RNAs (ncRNAs) play crucial roles in drug resistance and sensitivity, making them important biomarkers and therapeutic targets. However, predicting ncRNA-drug associations is challenging due to issues such as dataset imbalance and sparsity,...
Multi-omics data often suffer from the "big $p$, small $n$" problem where the dimensionality of features is significantly larger than the sample size, making the integration of multi-omics data for survival analysis of a specific cancer particularly ...
Identifying spatial domains for spatial transcriptomics is crucial for achieving comprehensive insights into the pathogenesis of gene expression. Increasingly, computational methods based on graph neural networks are being developed for spatial trans...