Cancer is driven by genomic alterations, but the processes causing this disease are largely performed by proteins. However, proteins are harder and more expensive to measure than genes and transcripts. To catalyze developments of methods to infer pro...
With advances in sequencing technology, a vast amount of genomic sequence information has become available. However, annotating biological functions particularly of non-protein-coding regions in genome sequences without experiments is still a challen...
Machine learning algorithms trained to predict the regulatory activity of nucleic acid sequences have revealed principles of gene regulation and guided genetic variation analysis. While the human genome has been extensively annotated and studied, mod...
In recent years, numerous applications have demonstrated the potential of deep learning for an improved understanding of biological processes. However, most deep learning tools developed so far are designed to address a specific question on a fixed d...
Computer methods and programs in biomedicine
Jun 27, 2020
BACKGROUND: The size of genomics data has been growing rapidly over the last decade. However, the conventional data analysis techniques are incapable of processing this huge amount of data. For the efficient processing of high dimensional datasets, i...
The training machine learning algorithm from an imbalanced data set is an inherently challenging task. It becomes more demanding with limited samples but with a massive number of features (high dimensionality). The high dimensional and imbalanced dat...
Advances in high-throughput sequencing technologies have reduced the cost of genotyping dramatically and led to genomic prediction being widely used in animal and plant breeding, and increasingly in human genetics. Inspired by the efficient computing...
Most organisms are more closely related to nearby than distant members of their species, creating spatial autocorrelations in genetic data. This allows us to predict the location of origin of a genetic sample by comparing it to a set of samples of kn...