Multi-trait and multi-environment data are common in animal and plant breeding programs. However, there is a lack of more powerful statistical models that can exploit the correlation between traits to improve prediction accuracy in the context of ge...
Genomic selection is revolutionizing plant breeding, and methods that improve prediction accuracy are therefore useful. For this reason, active research is being conducted to build and test methods from other areas and adapt them to the context of gen...
With advances in next-generation sequencing (NGS) technologies, non-invasive prenatal testing (NIPT) has been developed and employed in fetal aneuploidy screening for trisomies 13, 18, and 21 by detecting cell-free fetal DNA (cffDNA) in maternal b...
Cancer genomic analysis requires accurate identification of somatic variants in sequencing data. Manual review to refine somatic variant calls is required as a final step after automated processing. However, manual variant refinement is time-consumin...
Enhancers are cis-acting sequences that regulate transcription rates of their target genes in a cell-specific manner and harbor disease-associated sequence variants in cognate cell types. Many complex diseases are associated with enhancer malfunction...
Despite rapid advances in sequencing technologies, accurately calling genetic variants present in an individual genome from billions of short, error-prone sequence reads remains challenging. Here we show that a deep convolutional neural network can call...
BACKGROUND: Detection of highly divergent or yet unknown viruses from metagenomics sequencing datasets is a major bioinformatics challenge. When human samples are sequenced, a large proportion of assembled contigs are classified as "unknown", as conv...
Journal of computational biology : a journal of computational molecular cell biology
Aug 22, 2018
The identification of transcription factor binding sites (TFBSs) is a problem for which computational methods offer great hope. Thus far, the expectation maximization (EM) technique has been successfully utilized in finding TFBSs in DNA sequences, bu...
Second-generation DNA sequencing techniques generate short reads that can result in fragmented genome assemblies. Third-generation sequencing platforms mitigate this limitation by producing longer reads that span across complex and repetitive regions...
BACKGROUND: Spastic cerebral palsy (CP) is a leading cause of physical disability. Most people with spastic CP are born with it, but early diagnosis is challenging, and no current biomarker platform readily identifies affected individuals. The aim of...