Multinomial Convolutions for Joint Modeling of Regulatory Motifs and Sequence Activity Readouts.

Journal: Genes

Published Date: Sep 8, 2022

Abstract

A common goal in convolutional neural network (CNN) modeling of genomic data is to discover specific sequence motifs. Post hoc analysis methods aid in this task, but they depend on parameters whose optimal values are unclear, and applying the discovered motifs to new genomic data is not straightforward. As an alternative, we propose to learn convolutions as multinomial distributions, thus integrating interpretable motif discovery with CNN model fitting. We developed MuSeAM (Multinomial CNNs for Sequence Activity Modeling) by implementing multinomial convolutions in a CNN model. Through benchmarking, we demonstrate the efficacy of MuSeAM in accurately modeling genomic data while fitting multinomial convolutions that recapitulate known transcription factor motifs.
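The idea of learning a convolution as a multinomial distribution can be illustrated with a minimal NumPy sketch. It is not the MuSeAM implementation: the softmax parameterization, the uniform background, the log-odds scoring, and the names `one_hot`, `multinomial_filter`, and `scan` are all assumptions made for illustration. Each filter position is constrained to a probability distribution over A, C, G, and T, so the filter can be read directly as a position-specific motif.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed nucleotide ordering for the one-hot encoding


def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot matrix."""
    idx = np.array([ALPHABET.index(base) for base in seq])
    return np.eye(4)[idx]


def multinomial_filter(raw_weights):
    """Apply a softmax over the nucleotide axis.

    Each row of the result is a distribution over A, C, G, T,
    so the filter reads as a position probability matrix.
    """
    shifted = raw_weights - raw_weights.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def scan(seq, raw_weights, background=(0.25, 0.25, 0.25, 0.25)):
    """Slide the filter along the sequence and return one score per window.

    Assumption: windows are scored by the log-odds of the multinomial
    filter against a fixed background distribution.
    """
    probs = multinomial_filter(raw_weights)
    log_odds = np.log(probs) - np.log(np.asarray(background))
    x = one_hot(seq)
    width = raw_weights.shape[0]
    return np.array([
        np.sum(x[i:i + width] * log_odds)
        for i in range(len(seq) - width + 1)
    ])


# Hypothetical usage: a random width-3 filter scanned over a toy sequence.
rng = np.random.default_rng(0)
raw = rng.normal(size=(3, 4))
probs = multinomial_filter(raw)
scores = scan("ACGTACGT", raw)
```

In a trainable model, `raw_weights` would be the free parameters updated by gradient descent, while the normalized filter stays a valid distribution at every step. That is what would let a fitted filter be compared with known transcription factor motifs without a separate post hoc step.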

Authors

Minjun Park
Department of Integrative Physiology, Baylor College of Medicine, Houston, TX 77030, USA.

Salvi Singh
Department of Integrative Physiology, Baylor College of Medicine, Houston, TX 77030, USA.

Samin Rahman Khan
Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1000, Bangladesh.

Mohammed Abid Abrar
Computer Science and Engineering, Brac University, Dhaka 1212, Bangladesh.

Francisco Grisanti
Department of Integrative Physiology, Baylor College of Medicine, Houston, TX 77030, USA.

M Sohel Rahman
Department of CSE, BUET, ECE Building, West Palasi, Dhaka 1205, Bangladesh. Electronic address: msrahman@cse.buet.ac.bd.

Md Abul Hassan Samee
Department of Integrative Physiology, Baylor College of Medicine, Houston, TX 77030, USA.

Keywords

Genomics; Neural Networks, Computer; Transcription Factors

External Resources

PubMed ID: 36140783
