Clinical correlates of errors in machine-learning diagnostic model of autism spectrum disorder: Impact of sample cohorts.
Journal:
Autism : the international journal of research and practice
Published Date:
Aug 5, 2025
Abstract
Machine-learning models can assist in diagnosing autism but have biases. We examines the correlates of misclassifications and how training data affect model generalizability. The Social Responsive Scale data were collected from two cohorts in Taiwan: the clinical cohort comprised 1203 autistic participants and 1182 non-autistic comparisons, and the community cohort consisted of 35 autistic participants and 3297 non-autistic comparisons. Classification models were trained, and the misclassification cases were investigated regarding their associations with sex, age, intelligence quotient (IQ), symptoms from the child behavioral checklist (CBCL), and co-occurring psychiatric diagnosis. Models showed high within-cohort accuracy (clinical: sensitivity 0.91-0.95, specificity 0.93-0.94; community: sensitivity 0.91-1.00, specificity 0.89-0.96), but generalizability across cohorts was limited. When the community-trained model was applied to the clinical cohort, performance declined (sensitivity 0.65, specificity 0.95). In both models, non-autistic individuals misclassified as autistic showed elevated behavioral symptoms and attention-deficit hyperactivity disorder (ADHD) prevalence. Conversely, autistic individuals who were misclassified tended to show fewer behavioral symptoms and, in the community model, higher IQ and aggressive behavior but less social and attention problems. Error patterns of machine-learning model and the impact of training data warrant careful consideration in future research.Lay AbstractMachine-learning is a type of computer model that can help identify patterns in data and make predictions. In autism research, these models may support earlier or more accurate identification of autistic individuals. But to be useful, they need to make reliable predictions across different groups of people. In this study, we explored when and why these models might make mistakes-and how the kind of data used to train them affects their accuracy. Training models means using information to teach the computer model how to tell the difference between autistic and non-autistic individuals. We used the information from the Social Responsiveness Scale (SRS), which is a questionnaire that measures autistic features. We tested these models on two different groups: one from clinical settings and one from the general community. The models worked well when tested within the same type of group they were trained. However, a model trained on the community group did not perform as accurately when tested on the clinical group. Sometimes, the model got it wrong. For example, in the clinical group, some autistic individuals were mistakenly identified as non-autistic. These individuals tended to have fewer emotional or behavioral difficulties. In the community group, autistic individuals who were mistakenly identified as non-autistic had higher IQs and showed more aggressive behaviors but fewer attention or social problems. On the contrary, some non-autistic people were incorrectly identified as autistic. These people had more emotional or behavioral challenges and were more likely to have attention-deficit hyperactivity disorder (ADHD). These findings highlight that machine-learning models are sensitive to the type of data they are trained on. To build fair and accurate models for predicting autism, it is essential to consider where the training data come from and whether it represents the full diversity of individuals. 
Understanding these patterns of error can help improve future tools used in both research and clinical care.
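As a rough illustration of the cross-cohort evaluation the abstract describes, the sketch below trains a classifier on one cohort and tests it on the other, reporting sensitivity and specificity and collecting the misclassified cases for correlate analysis. The abstract does not specify the classifier or the data format, so the logistic-regression model, the file names, and the column names (srs_* feature columns, a binary autism label) are assumptions for illustration only, not the authors' actual pipeline.

```python
# Minimal sketch of a cross-cohort evaluation, assuming a logistic-regression
# classifier on SRS item scores. File and column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cohort files; each row is one participant.
clinical = pd.read_csv("clinical_cohort.csv")
community = pd.read_csv("community_cohort.csv")
features = [c for c in clinical.columns if c.startswith("srs_")]

# Train on the community cohort ...
model = LogisticRegression(max_iter=1000)
model.fit(community[features], community["autism"])

# ... then test on the clinical cohort (cross-cohort generalizability).
pred = model.predict(clinical[features])
sens, spec = sensitivity_specificity(clinical["autism"], pred)
print(f"community -> clinical: sensitivity={sens:.2f}, specificity={spec:.2f}")

# Misclassified cases can then be compared with correctly classified ones
# on sex, age, IQ, CBCL symptoms, and co-occurring diagnoses such as ADHD.
errors = clinical.loc[pred != clinical["autism"]]
```

One practical caveat the sketch glosses over: the community cohort is highly imbalanced (35 autistic vs. 3297 non-autistic participants), so in a real analysis class weighting or decision-threshold tuning would likely be needed for a meaningful sensitivity estimate.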