Using game theory and decision decomposition to effectively discern and characterise bi-locus diseases.

Journal: Artificial intelligence in medicine
PMID:

Abstract

Understanding disorders that involve bi-locus variant combinations appears to be key to gaining insight into oligogenic disorders. In prior work, we showed that features at multiple biological scales can be used to discriminate between two types of such disorders: those involving true digenic combinations and those involving modifier combinations. The current study extends this machine learning work to dual molecular diagnosis cases, providing a classifier able to distinguish between these three types. To reach this goal and gain an in-depth understanding of the decision process, game theory and tree decomposition techniques are applied to random forest predictors to investigate the relevance of feature combinations in the prediction. A machine learning model with high discrimination capability was developed that differentiates the three classes in a biologically meaningful manner. Combining prediction interpretation and statistical analysis, we propose a biologically meaningful characterization of each class relying on specific feature strengths. Understanding how biological characteristics shift samples towards one of the three classes provides clinically relevant insight into the underlying biological processes as well as the disease itself.
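
The abstract does not name the specific game-theoretic method applied to the random forest predictors. As an illustration only, the sketch below assumes Shapley values, a common game-theoretic way to attribute a prediction to individual features. The function `shapley_values`, the scoring function `toy_score`, and the all-zero baseline are hypothetical stand-ins chosen for this example; they are not the authors' model, features, or data.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) relative to predict(baseline).

    Assumption for this sketch: a feature that is "absent" from a coalition
    is replaced by its baseline value.
    """
    n = len(x)

    def value(coalition):
        # Features in the coalition keep their observed value; others use the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(z):
    # Hypothetical class score: two additive terms plus one feature interaction.
    return 0.5 * z[0] + 0.2 * z[1] + 0.3 * z[0] * z[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
```

In this toy case the attributions sum to the difference between the score at `x` and the score at the baseline, and the interaction term between the first and third features is split equally between them. Exact enumeration grows exponentially with the number of features, so it is only practical for very small feature sets.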

Authors

  • Nassim Versbraegen
    Interuniversity Institute for Bioinformatics in Brussels, ULB-VUB, 1050 Brussels, Belgium; Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium. Electronic address: nversbra@ulb.ac.be.
  • Aziz Fouché
    Interuniversity Institute for Bioinformatics in Brussels, ULB-VUB, 1050 Brussels, Belgium; École normale supérieure Paris-Saclay, 94230 Cachan, France. Electronic address: afouche@ens-paris-saclay.fr.
  • Charlotte Nachtegael
    Interuniversity Institute for Bioinformatics in Brussels, ULB-VUB, 1050 Brussels, Belgium; Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium.
  • Sofia Papadimitriou
    Interuniversity Institute for Bioinformatics in Brussels, ULB-VUB, 1050 Brussels, Belgium; Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium; Artificial Intelligence Lab, Vrije Universiteit Brussel, 1050 Brussels, Belgium.
  • Andrea Gazzo
    Interuniversity Institute for Bioinformatics in Brussels, ULB-VUB, 1050 Brussels, Belgium; Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium; Center for Medical Genetics, Reproduction and Genetics, Reproduction Genetics and Regenerative Medicine, Vrije Universiteit Brussel, UZ Brussel, 1050 Brussels, Belgium.
  • Guillaume Smits
    Interuniversity Institute for Bioinformatics in Brussels, ULB-VUB, 1050 Brussels, Belgium; Hôpital Universitaire des Enfants Reine Fabiola, Université Libre de Bruxelles, 1020 Brussels, Belgium; Center for Medical Genetics, Hôpital Erasme, Université Libre de Bruxelles, 1070 Brussels, Belgium.
  • Tom Lenaerts
    Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan.