Constrained neuro fuzzy inference methodology for explainable personalised modelling with applications on gene expression data.

Journal: Scientific Reports
PMID:

Abstract

Interpretable machine learning models for gene expression datasets are important for understanding the decision-making process of a classifier and gaining insights into the underlying molecular processes of genetic conditions. Interpretable models can potentially support early diagnosis before full disease manifestation. This is particularly important, yet challenging, for mental health. We hypothesise that this is due to extreme heterogeneity, which may be overcome and explained by personalised modelling techniques. Thus far, most machine learning methods applied to gene expression datasets, including deep neural networks, lack personalised interpretability. This paper proposes a new methodology named personalised constrained neuro fuzzy inference (PCNFI) for learning, from high-dimensional datasets, personalised rules that are structurally and semantically interpretable. Case studies on two mental-health-related datasets (schizophrenia and bipolar disorders) have shown that the relatively short and simple personalised fuzzy rules provided enhanced interpretability as well as better classification performance compared to other commonly used machine learning methods. A performance test on a cancer dataset also showed that PCNFI matches previous benchmarks. Insights from our approach also indicated the importance of two genes (ATRX and TSPAN2) as possible biomarkers for early differentiation of ultra-high risk, bipolar and healthy individuals. These genes are linked to cognitive ability and impulsive behaviour. Our findings suggest a significant starting point for further research into the biological role of cognitive and impulsivity-related differences. With potential applications across biomedical research, the proposed PCNFI method is promising for diagnosis, prognosis, and the design of personalised treatment plans for better outcomes in the future.

Authors

  • Balkaran Singh
    Knowledge Engineering and Discovery Research Innovation (KEDRI), School of Engineering, Computer and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand. balkaran.singh@aut.ac.nz.
  • Maryam Doborjeh
    Knowledge Engineering and Discovery Research Institute (KEDRI), Auckland University of Technology, Auckland, New Zealand.
  • Zohreh Doborjeh
    Knowledge Engineering and Discovery Research Institute (KEDRI), Auckland University of Technology, Auckland, New Zealand. zgholami@aut.ac.nz.
  • Sugam Budhraja
    Knowledge Engineering and Discovery Research Innovation (KEDRI), School of Engineering, Computer and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand.
  • Samuel Tan
    Lee Kong Chian School of Medicine, Nanyang Technological University (NTU), Singapore, Singapore.
  • Alexander Sumich
    Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, AUT Tower, 7th floor, 2 Wakefield Street, Auckland, 1010, New Zealand.
  • Wilson Goh
    Lee Kong Chian School of Medicine, Nanyang Technological University (NTU), Singapore, Singapore.
  • Jimmy Lee
    Institute of Mental Health, Singapore, Singapore.
  • Edmund Lai
    Knowledge Engineering and Discovery Research Innovation (KEDRI), School of Engineering, Computer and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand.
  • Nikola Kasabov
    Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Auckland 1010, New Zealand. Electronic address: nkasabov@aut.ac.nz.