Parkinson disease detection based on in-air dynamics feature extraction and selection using machine learning.
Journal:
Scientific Reports
Published Date:
Jul 31, 2025
Abstract
Parkinson's disease (PD) is a progressive neurological disorder that impairs movement control, leading to symptoms such as tremor, stiffness, and bradykinesia. Early and accurate PD detection is essential for effective management and improved patient outcomes. Studies that analyze handwriting data for PD detection typically compute statistical features over the entire handwriting task. While this approach captures broad patterns, it has several limitations: it does not focus on dynamic change, it oversimplifies the feature representation, it lacks directional information, and it misses micro-movements and other subtle variations. Consequently, such systems struggle to achieve high accuracy, robustness, and sensitivity. To overcome these limitations, we propose an optimized PD detection methodology that combines newly developed dynamic kinematic features with machine learning (ML) techniques to capture movement dynamics during handwriting tasks. Unlike typical PD detection methods, which only differentiate between PD and non-PD cases, our approach classifies PD patients into distinct stages (early, mid, and late) based on disease duration, reflecting progression over time. We first extracted 65 newly developed kinematic features from the handwriting task, aiming to capture significant variations in acceleration, deceleration, and directional changes, which are subtle movements that traditional methods may struggle to detect. We also reused 23 existing kinematic features, resulting in a comprehensive new feature set. Next, we enriched the kinematic features by applying statistical functions to compute hierarchical features from the handwriting data. This approach allows us to capture subtle movement variations that distinguish PD patients from healthy controls.
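To illustrate the kind of dynamic kinematic features described above, the following is a minimal sketch rather than the authors' feature set. It assumes one trajectory segment given as pen coordinates `x`, `y` and timestamps `t`; the function name `kinematic_features` and every feature name in the returned dictionary are hypothetical.

```python
import numpy as np

def kinematic_features(x, y, t):
    """Summarize the dynamics of one pen trajectory segment.

    x, y: pen coordinates; t: strictly increasing timestamps (same length).
    The feature names below are illustrative, not the paper's feature set.
    """
    x, y, t = (np.asarray(a, dtype=float) for a in (x, y, t))
    if not (x.size == y.size == t.size) or x.size < 3:
        raise ValueError("need at least 3 samples of equal length")

    # Velocity components by finite differences on the (possibly uneven) time grid.
    vx, vy = np.gradient(x, t), np.gradient(y, t)
    speed = np.hypot(vx, vy)

    # Tangential acceleration: positive values speed up, negative values slow down.
    accel = np.gradient(speed, t)
    jerk = np.gradient(accel, t)

    # Change of heading between consecutive samples, unwrapped to avoid 2*pi jumps.
    heading = np.unwrap(np.arctan2(vy, vx))
    turn = np.abs(np.diff(heading))

    return {
        "speed_mean": speed.mean(),
        "speed_std": speed.std(),
        "accel_mean": np.clip(accel, 0, None).mean(),   # acceleration part only
        "decel_mean": np.clip(accel, None, 0).mean(),   # deceleration part only
        "jerk_abs_mean": np.abs(jerk).mean(),
        "turn_mean": turn.mean(),
        "turn_max": turn.max(),
    }
```

For in-air analysis, one would presumably call this on each segment recorded while the pen is off the surface, assuming the recording carries a pen-status flag that allows such segments to be separated.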
To further optimize the feature set, we applied the Sequential Forward Floating Selection (SFFS) method to select the most relevant features, reducing dimensionality and computational complexity. Finally, we employed an ML-based approach with ensemble voting across the top-performing tasks, achieving 96.99% accuracy on task-wise classification and 99.98% accuracy on task ensembles, surpassing the existing state-of-the-art model by 2% on the PaHaW dataset. This accuracy underscores the potential of our approach to set a new benchmark for PD detection. Our code and data are available at: https://github.com/musaru/PD_PaHaW
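The selection and voting steps can be sketched as follows. This is a generic, dependency-free illustration of Sequential Forward Floating Selection and of hard majority voting across tasks, not the authors' implementation. The names `sffs`, `majority_vote`, and the user-supplied `score` callable (for example, cross-validated accuracy of a classifier on a feature subset) are assumptions, as is the choice of hard voting.

```python
from collections import Counter

def sffs(score, n_features, k):
    """Sequential Forward Floating Selection of k feature indices.

    score: callable taking a list of feature indices and returning a number
           to maximize (hypothetical; e.g. cross-validated accuracy).
    """
    if not 1 <= k <= n_features:
        raise ValueError("k must be between 1 and n_features")
    selected = []
    best = {}  # subset size -> (best score seen, subset)
    while len(selected) < k:
        # Forward step: add the single feature that helps most.
        candidates = [f for f in range(n_features) if f not in selected]
        add = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [add]
        s = score(selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, list(selected))
        # Floating step: drop features while that strictly beats the best
        # subset previously found at the smaller size.
        while len(selected) > 2:
            drop = max(selected,
                       key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            s_red = score(reduced)
            if s_red <= best[len(reduced)][0]:
                break
            selected = reduced
            best[len(reduced)] = (s_red, list(reduced))
    return best[k][1]

def majority_vote(task_predictions):
    """Combine per-task predictions (one label list per task) by hard voting.

    Ties go to the label that appears first among the tasks.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*task_predictions)]
```

A toy usage: with an additive score over weights `[0.1, 0.5, 0.3, 0.9]`, `sffs(lambda s: sum(w[i] for i in s), 4, 2)` picks features 3 and 1. The floating step only matters when features interact, which is the situation it is designed for.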