Random forest machine-learning algorithm classifies white- and brown-rot fungi according to the number of the genes encoding Carbohydrate-Active enZyme families.

Journal: Applied and environmental microbiology
PMID:

Abstract

Wood-rotting fungi play an important role in the global carbon cycle because they are the only known organisms that digest wood, the largest carbon stock in nature. In the present study, we used linear discriminant analysis and random forest (RF) machine learning algorithms to predict white- or brown-rot decay modes from the numbers of genes encoding Carbohydrate-Active enZymes with over 98% accuracy. Unlike other algorithms, RF identified specific genes involved in cellulose and lignin degradation, including auxiliary activities (AAs) family 9 lytic polysaccharide monooxygenases, glycoside hydrolase family 7 cellobiohydrolases, and AA family 2 peroxidases, as critical factors. This study sheds light on the complex interplay between genetic information and decay modes and underscores the potential of RF for comparative genomics studies of wood-rotting fungi.
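The general idea of classifying decay mode from per-family gene counts and then ranking families by importance can be sketched as follows. This is a minimal illustration with scikit-learn, not the authors' pipeline: the count table is synthetic, the Poisson means are invented for demonstration only, the `filler_*` columns are hypothetical non-discriminating families, and the helper names `make_synthetic_counts` and `rank_families` are assumptions introduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def make_synthetic_counts(n_per_class=40, seed=0):
    """Build a synthetic genome-by-family count table (NOT real data).

    Rows are hypothetical genomes, columns are gene counts per family.
    The Poisson means below are invented purely for illustration.
    """
    rng = np.random.default_rng(seed)
    families = ["AA9", "GH7", "AA2", "filler_1", "filler_2", "filler_3"]
    white_means = [15, 5, 10, 20, 25, 3]      # assumed: more AA9/GH7/AA2 copies
    brown_means = [3, 0.2, 0.2, 20, 25, 3]    # assumed: few or none of these
    white = rng.poisson(white_means, size=(n_per_class, len(families)))
    brown = rng.poisson(brown_means, size=(n_per_class, len(families)))
    X = np.vstack([white, brown])
    y = np.array([1] * n_per_class + [0] * n_per_class)  # 1 = white, 0 = brown
    return X, y, families


def rank_families(X, y, families, seed=0):
    """Cross-validate a random forest, then rank families by importance."""
    rf = RandomForestClassifier(n_estimators=200, random_state=seed)
    # Mean accuracy over 5 stratified folds.
    mean_accuracy = cross_val_score(rf, X, y, cv=5).mean()
    # Refit on all rows to read impurity-based feature importances.
    rf.fit(X, y)
    ranking = sorted(
        zip(families, rf.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return mean_accuracy, ranking


if __name__ == "__main__":
    X, y, families = make_synthetic_counts()
    mean_accuracy, ranking = rank_families(X, y, families)
    print(f"mean CV accuracy: {mean_accuracy:.3f}")
    for name, importance in ranking:
        print(f"{name:10s} {importance:.3f}")
```

Impurity-based importances are only one way to rank features; permutation importance is a common alternative when features are correlated. The abstract does not say which measure was used.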

Authors

  • Natsuki Hasegawa
    Department of Biomaterial Sciences, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan.
  • Masashi Sugiyama
  • Kiyohiko Igarashi
    Department of Biomaterial Sciences, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan.