Machine Learning Algorithm for Noise Reduction and Disease-Causing Gene Feature Extraction in Gene Sequencing Data
Journal: arXiv
Published Date: May 26, 2025
Abstract
In this study, we propose a machine learning-based method for noise reduction
and disease-causing gene feature extraction in gene sequencing data. The
DeepSeqDenoise algorithm combines a convolutional neural network (CNN) and a
recurrent neural network (RNN) to effectively remove sequencing noise, and
improves the signal-to-noise ratio by 9.4 dB. We selected 17 key features
through feature engineering and constructed an ensemble learning model that
predicts disease-causing genes with 94.3% accuracy. In validation on a
cardiovascular disease cohort, we identified 57 new candidate disease-causing
genes, and in clinical applications we detected 3 previously missed variants.
The method
significantly outperforms existing tools and provides strong support for
accurate diagnosis of genetic diseases.
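
To put the reported 9.4 dB figure in context, the sketch below converts a decibel gain into a linear power ratio. It assumes the conventional definition of signal-to-noise ratio as 10 * log10(P_signal / P_noise); the abstract does not state which definition the authors use, and the function names here are illustrative, not part of DeepSeqDenoise.

```python
import math


def snr_db(signal_power: float, noise_power: float) -> float:
    """SNR in decibels, assuming the conventional power-ratio definition."""
    return 10.0 * math.log10(signal_power / noise_power)


def db_to_power_ratio(gain_db: float) -> float:
    """Convert a gain in decibels to a linear power ratio."""
    return 10.0 ** (gain_db / 10.0)


# Hypothetical illustration: what a 9.4 dB improvement means as a power ratio.
ratio = db_to_power_ratio(9.4)
print(f"9.4 dB corresponds to a power ratio of about {ratio:.2f}")
```

Under this assumed definition, a 9.4 dB gain means the signal-to-noise power ratio after denoising is roughly 8.7 times the ratio before denoising.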