Tumors are complex tissues of cancerous cells surrounded by a heterogeneous cellular microenvironment with which they interact. Single-cell sequencing enables molecular characterization of single cells within the tumor. However, cell annotation-the a...
BACKGROUND: Cancer is a set of diseases characterized by unchecked cell proliferation and invasion of surrounding tissues. The many genes that have been genetically associated with cancer or shown to directly contribute to oncogenesis vary widely bet...
Integrative analysis of large-scale single-cell RNA sequencing (scRNA-seq) datasets can aggregate complementary biological information from different datasets. However, most existing methods fail to efficiently integrate multiple large-scale scRNA-se...
Multiplex assays of variant effect (MAVEs) are a family of methods that includes deep mutational scanning experiments on proteins and massively parallel reporter assays on gene regulatory sequences. Despite their increasing popularity, a general stra...
The advancement of highly multiplexed spatial technologies requires scalable methods that can leverage spatial information. We present MISTy, a flexible, scalable, and explainable machine learning framework for extracting relationships from any spati...
Few methods have been developed to investigate copy number variants (CNVs) based on their predicted pathogenicity. We introduce TADA, a method to prioritise pathogenic CNVs through assisted manual filtering and automated classification, based on an e...
BACKGROUND: The cell cycle is a highly conserved, continuous process which controls faithful replication and division of cells. Single-cell technologies have enabled increasingly precise measurements of the cell cycle both as a biological process of ...
BACKGROUND: Accurate detection of somatic mutations is challenging but critical in understanding cancer formation, progression, and treatment. We recently proposed NeuSomatic, the first deep convolutional neural network-based somatic mutation detecti...
BACKGROUND: Enhancers are non-coding regions of the genome that control the activity of target genes. Recent efforts to identify active enhancers experimentally and in silico have proven effective. While these tools can predict the locations of enhan...
We present SquiggleNet, the first deep-learning model that can classify nanopore reads directly from their electrical signals. SquiggleNet operates faster than DNA passes through the pore, allowing real-time classification and read ejection. Using 1 ...