A Virtual Machine Platform for Non-Computer Professionals for Using Deep Learning to Classify Biological Sequences of Metagenomic Data.

Journal: Journal of visualized experiments : JoVE
Published Date:

Abstract

Biological sequence classification tasks, such as species classification, gene function classification, and viral host classification, are routine steps in many metagenomic data analyses. Because metagenomic data contain a large number of novel species and genes, many studies require high-performing classification algorithms. Biologists often struggle to find suitable sequence classification and annotation tools for a specific task, and often cannot construct a corresponding algorithm themselves because they lack the necessary mathematical and computational knowledge. Deep learning techniques have recently become popular and show strong advantages in many classification tasks. To date, many high-level deep learning packages have been developed, which allow biologists to construct deep learning frameworks according to their own needs without in-depth knowledge of the algorithmic details. In this tutorial, we provide a guideline for constructing an easy-to-use deep learning framework for sequence classification that does not require extensive mathematical knowledge or programming skills. All the code is optimized in a virtual machine so that users can directly run the code using their own data.

Authors

  • Zhencheng Fang
    State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, No.5 Yiheyuan Road Haidian District, Beijing 100871, China.
  • Hongwei Zhou
    Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, 100700, China.