Exploring Machine Learning Models to Uncover Pathways in ALS Pathogenesis Using Immunohistochemical Features

Journal: medRxiv
Published Date:

Abstract

Amyotrophic lateral sclerosis (ALS) is a degenerative disease of motor neurons that leads to muscle wasting, paralysis, and death, with an average life expectancy of 2–5 years. Approximately 10–15% of ALS cases are familial (fALS), typically linked to, but not always caused by, identifiable inherited genetic mutations. The remaining 85–90% are considered sporadic (sALS): they typically occur without a clear family history and are thought to result from a combination of genetic and non-genetic risk factors. ALS imposes heavy physical, psychological, and financial burdens on patients and caregivers. Early diagnosis is critical but remains challenging because of clinical variability and symptoms that overlap with other motor neuron disorders. Current diagnostic methods, including genetic testing and neurophysiological techniques, are limited in reproducibility and accessibility, whereas machine learning can potentially detect patterns that traditional methods overlook. This study applies machine learning to characterise disease status in C9orf72-ALS patients and to evaluate how pathological biomarkers relate to disease mechanisms. A tabular dataset derived from post-mortem brain tissue of 10 C9orf72-ALS patients and 10 controls was used to train models, and the results were benchmarked against Rifai et al. (2022). The models comprised random forest, support vector machine, XGBoost, logistic regression, artificial neural networks, and ensembles, validated using 3-fold and 5-fold grouped cross-validation. The best result was obtained by random forest with 3-fold cross-validation for Iba1, achieving 88% sensitivity (p = 0.0011) and 83% specificity (p = 0.0004). However, because 3-fold cross-validation is less robust, we expect more reliable and stable results from 5-fold grouped cross-validation and, in future work, from repeated cross-validation. Machine learning offers insights into ALS, with implications for patient stratification and case identification.
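
As an illustration of the validation scheme named above, the following is a minimal sketch of grouped k-fold splitting and of how sensitivity and specificity are computed. It is not the authors' pipeline: the function names, the donor identifiers, and the assumption of three rows per donor are hypothetical and chosen only to show why rows from the same donor should stay in the same fold.

```python
from collections import defaultdict


def grouped_k_fold(groups, n_splits):
    """Yield (train_idx, test_idx) pairs in which no group spans both sets."""
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    # Assign whole groups to folds round-robin, largest groups first,
    # so that fold sizes stay roughly balanced.
    ordered = sorted(by_group, key=lambda g: (-len(by_group[g]), str(g)))
    folds = [[] for _ in range(n_splits)]
    for i, group in enumerate(ordered):
        folds[i % n_splits].extend(by_group[group])
    for k in range(n_splits):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical layout: 20 donors with 3 rows each (assumed for illustration only).
toy_groups = [f"donor{d}" for d in range(20) for _ in range(3)]
toy_splits = list(grouped_k_fold(toy_groups, n_splits=5))
```

Each pair in `toy_splits` keeps every donor's rows entirely in either the training or the test set, which avoids leakage between folds when a donor contributes several measurements. In practice a library splitter such as scikit-learn's `GroupKFold` would serve the same purpose.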

Authors

  • Jemimah Maria Kuruvilla; Olivia M Rifai; James Longden; Jenna M Gregory; Marta Vallejo