BioDiscViz: A visualization support and consensus signature selector for BioDiscML results.

Journal: PloS one
PMID:

Abstract

Machine learning (ML) algorithms are powerful tools for finding complex patterns and biomarker signatures when conventional statistical methods fail to identify them. Although the ML field has made significant progress, state-of-the-art methodologies for building efficient, non-overfitting models are not always applied in the literature. To address this, automated programs such as BioDiscML were designed to identify biomarker signatures and correlated features while avoiding overfitting through multiple evaluation strategies, such as cross-validation, bootstrapping and repeated holdout. To further improve BioDiscML and reach a broader audience, better visualization support and more flexibility in choosing the best models and signatures are needed. Thus, to provide researchers with an easily accessible and usable tool for in-depth investigation of BioDiscML outputs, we developed a visual interaction tool called BioDiscViz. This tool provides summaries, tables and graphics, in the form of Principal Component Analysis (PCA), UMAP and t-SNE plots, heatmaps and boxplots, for the best model and the correlated features. Furthermore, it provides visual support for extracting a consensus signature from BioDiscML models using a combination of filters. BioDiscViz offers visual support for research using ML and should create new opportunities in this field by opening it to a broader community.

Authors

  • Sophiane Bouirdene
    Département de Médecine Moléculaire du CHU de Québec, Université Laval, Québec, QC, Canada.
  • Mickaël Leclercq
    Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, Québec, Canada.
  • Léopold Quitté
    Département de Médecine Moléculaire du CHU de Québec, Université Laval, Québec, QC, Canada.
  • Steve Bilodeau
    Département d'oncologie, Centre de recherche du CHU de Québec - Université Laval, Québec, Québec, Canada.
  • Arnaud Droit
    Proteomics platform, CHU de Québec - Université Laval Research Center, Québec City, Québec, Canada; Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, Québec, Canada; Département de Médecine Moléculaire, Faculté de médecine, Université Laval, Québec City, QC, Canada. Electronic address: arnaud.droit@crchuq.ulaval.ca.