Development of a novel musculoskeletal hypothesis using sparse Group Factor Analysis: the ADVANCE cohort

Journal: medRxiv
Published Date:

Abstract

Musculoskeletal conditions are a leading global cause of disability, yet the factors influencing long-term musculoskeletal health, particularly following trauma, remain incompletely understood. This study applies sparse Group Factor Analysis, a hierarchical unsupervised machine learning method, to the ADVANCE cohort, a longitudinal dataset of 1145 UK Afghanistan War servicemen, to identify latent structures in multimodal clinical data. Study 1 validated the approach by rediscovering known group-level patterns between combat-injured and non-injured participants, including poorer outcomes in pain, mobility, and bone health among those with lower limb loss. Study 2 explored the injured, non-amputee subgroup without prespecified labels to identify new hypothesis-generating clusters that could subsequently be tested using standard hypothesis testing methods. A subgroup of 125 individuals with worse musculoskeletal outcomes was uncovered. This group had greater body mass, higher injury severity, and a higher prevalence of head injury. These findings led to a novel hypothesis: that head injury, including potential traumatic brain injury, is associated with long-term musculoskeletal deterioration. This hypothesis is supported by literature in both athletic and military populations and will be tested in follow-up analyses. Our findings demonstrate how sparse Group Factor Analysis, combined with clinical insight, can uncover hidden patterns in large-scale datasets and generate testable, clinically relevant hypotheses that inform prevention, treatment, and rehabilitation strategies.

Musculoskeletal conditions such as osteoarthritis and low back pain are the second largest contributor to global disability. They can be caused by a variety of factors such as ageing, genetics, lifestyle, and injury.
Understanding the interconnectedness of long-term musculoskeletal outcomes following injury could help improve prevention, intervention and rehabilitation initiatives to reduce resulting disability. In this study, we describe a new machine learning methodology called sparse Group Factor Analysis, which we apply to a complex dataset from a military cohort study to generate new research hypotheses. The first study (n=1145) validated our approach by generating hypotheses that we had already investigated via traditional methods. The second study examined a subset of the cohort and identified 125 participants with poor musculoskeletal outcomes. This showed a link between poor musculoskeletal outcomes and head injury, resulting in a new hypothesis that a head injury or traumatic brain injury may contribute to poor musculoskeletal outcomes. We will test this hypothesis using traditional methods in follow-up analyses. We have demonstrated how sparse Group Factor Analysis can be used alongside clinical knowledge to find hidden patterns in large, complex datasets, providing information that could improve the prevention of future musculoskeletal injury as well as intervention and rehabilitation strategies.

Authors

  • Fraje CE Watson; Fabio S Ferreira; Balasundaram Kadirvelu; Alex N Bennett; Aldo A Faisal; Neil Graham; Harriet Kemp; Paul Cullinan; Christopher Boos; Nicola T Fear; Anthony MJ Bull

Categories