Epigenetic modifications function in gene transcription, RNA metabolism, and other biological processes. However, multiple factors currently limit the scientific utility of epigenomic datasets generated for plants. Here, using deep-learning approache...
DNA is a complex molecule carrying the instructions an organism needs to develop, live and reproduce. In 1953, Watson and Crick discovered that DNA is composed of two chains forming a double-helix. Later on, other structures of DNA were discovered an...
Isolated sulfite oxidase deficiency (ISOD) is a rare hereditary metabolic disease caused by absence of functional sulfite oxidase (SO) due to mutations of the SUOX gene. SO oxidizes toxic sulfite and sulfite accumulation is associated with neurologic...
Genomic selection (GS) is revolutionizing conventional ways of developing new plants and animals. However, because it is a predictive methodology, GS strongly depends on statistical and machine learning to perform these predictions. For continuous ou...
Deep learning (DL) is revolutionizing the development of artificial intelligence systems. For example, before 2015, humans were better than artificial machines at classifying images and solving many problems of computer vision (related to object loca...
Species harbor extensive structural variation underpinning recent adaptive evolution. However, the causality between genomic features and the induction of new rearrangements is poorly established. Here, we analyze a global set of telomere-to-telomere...
A recent deluge of publicly available multi-omics data has fueled the development of machine learning methods aimed at investigating important questions in genomics. Although the motivations for these methods vary, a task that is commonly adopted is ...
As one of important epigenetic modifications, DNA N4-methylcytosine (4mC) plays a crucial role in controlling gene replication, expression, cell cycle, DNA replication, and differentiation. The accurate identification of 4mC sites is necessary to und...
BACKGROUND: Supervised learning from high-throughput sequencing data presents many challenges. For one, the curse of dimensionality often leads to overfitting as well as issues with scalability. This can bring about inaccurate models or those that re...
Gradient Forests (GF) is a machine learning algorithm that is gaining in popularity for studying the environmental drivers of genomic variation and for incorporating genomic information into climate change impact assessments. Here we (i) provide the ...