Species annotation using a k-mer based KNN model.

Journal: Bioinformation

Published Date: Sep 30, 2024

Abstract

Bacterial identification is a critical process in microbiology, clinical diagnostics, environmental monitoring, and food safety. Machine learning holds great promise for improving bacterial identification by increasing accuracy, speed, and scalability. However, challenges such as data dependency, model interpretability, and computational demands must be addressed to fully realize its potential. The k-mer based bacterial identification algorithm described here is an attempt to address these issues. Sequence matching is performed using the k-nearest neighbors (KNN) technique. The workflow includes feature extraction, dataset preparation, classifier training, and label prediction based on the similarity of k-mer frequency distributions. The algorithm's performance was evaluated using metrics such as F1 score and precision, and it achieved an accuracy of 93%.
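
The workflow named in the abstract (feature extraction, dataset preparation, classifier training, and label prediction from k-mer frequency similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the k-mer length of 3, the Euclidean distance, the neighbor count, the toy sequences, and the names `kmer_profile` and `knn_predict` are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=3):
    """Feature extraction: normalized k-mer frequencies over the ACGT alphabet."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers made of A, C, G, T are counted; avoid division by zero.
    total = sum(counts[m] for m in kmers) or 1
    return [counts[m] / total for m in kmers]


def euclidean(a, b):
    """Distance between two frequency vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query_seq, n_neighbors=3, k=3):
    """Label prediction: majority vote among the nearest reference profiles."""
    query = kmer_profile(query_seq, k)
    ranked = sorted(train, key=lambda item: euclidean(item[0], query))
    votes = Counter(label for _, label in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


# Dataset preparation: hypothetical labeled reference sequences (toy data).
references = [
    ("ATATATATATATATATATAT", "species_A"),
    ("ATATTATATAATATATATTA", "species_A"),
    ("GCGCGCGCGCGCGCGCGCGC", "species_B"),
    ("GCGGCGCGCCGCGCGCGGCG", "species_B"),
]

# "Training" a KNN classifier amounts to storing the labeled profiles.
train = [(kmer_profile(seq), label) for seq, label in references]

prediction = knn_predict(train, "ATATATTATATATAATATAT", n_neighbors=3)
print(prediction)
```

Real genomes would call for a longer k-mer length and a far larger reference set, and the neighbor count and distance metric would normally be tuned on held-out data.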

Authors

Srushti Sangar

Department of Bioinformatics, Rajiv Gandhi Institute of IT and Biotechnology, Bharati Vidyapeeth (Deemed to be University), Pune, Maharashtra, India.

Prathamesh Kolage

Department of Bioinformatics, Rajiv Gandhi Institute of IT and Biotechnology, Bharati Vidyapeeth (Deemed to be University), Pune, Maharashtra, India.

Pritee Chunarkar-Patil

Department of Bioinformatics, Rajiv Gandhi Institute of IT and Biotechnology, Bharati Vidyapeeth (Deemed to be University), Pune, Maharashtra, India.

External Resources

PubMed ID (PMID): 39917243
