Role of risk factors and their variable types in predicting noise-induced hearing loss using artificial intelligence algorithms.
Journal:
Hearing Research
Published Date:
Jul 1, 2025
Abstract
Early prediction and warning of occupational noise-induced hearing loss (NIHL) in workers is critical. This study aimed to explore the role of risk factors and their variable types in NIHL prediction using machine learning (ML) techniques. Data on exposure and NIHL were sourced from the Chinese National Occupational Disease Surveillance Programs and field measurements involving 15,160 workers. We developed predictive models based on logistic regression, three tree-based algorithms (random forest [RF], extreme gradient boosting [XGBoost], and light gradient boosting machine [LGBM]), and a tabular neural network (TabNet). Eight features (age, sex, noise exposure duration [ED], A-weighted equivalent sound pressure [L], kurtosis, systolic blood pressure, diastolic blood pressure, and hearing protection device [HPD] usage) were evaluated through logistic regression and ML feature importance analyses. Models were trained using both original and categorized versions of the variables to compare the predictive value of variable types and assess the applicability of each algorithm. Multivariate logistic regression indicated that age, noise ED, L, sex, and HPD usage were significantly associated with NIHL (P < 0.05). For all algorithms except logistic regression (that is, the tree-based and TabNet models), models built with original variable types outperformed those built with categorized variables (P < 0.05). The LGBM model using original variable types achieved the best performance on the test set (area under the curve [AUC] 0.745; 95% CI 0.729-0.763). Feature importance analysis revealed that L (LGBM), sex (XGBoost), age (RF), and kurtosis (TabNet) were key predictive variables, consistent with logistic regression results. Our study concludes that risk factors in their continuous form provided superior predictive value for NIHL compared with their categorized form. Tree-based and TabNet algorithms offer effective methods for assessing and predicting NIHL.
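The comparison between original and categorized variables can be illustrated with a minimal sketch. It is not the study's pipeline: it uses no ML model, only a rank-based AUC computed directly on a single predictor, and the six toy records, the cut points, and the function names `auc` and `categorize` are all hypothetical choices made for illustration. The point it shows is that binning a continuous predictor turns some correctly ordered case/non-case pairs into ties, which can lower the AUC.

```python
from bisect import bisect_right


def auc(scores, labels):
    """Rank-based AUC: the share of (case, non-case) pairs in which the
    case has the higher score, counting tied pairs as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def categorize(value, cut_points):
    """Map a continuous value to an ordinal bin index.
    cut_points must be sorted; a value equal to a cut point
    falls into the higher bin."""
    return bisect_right(cut_points, value)


# Toy data, invented for illustration only (not from the study).
ages = [25, 28, 35, 38, 45, 48]
nihl = [0, 1, 0, 1, 0, 1]

# Hypothetical cut points: <30, 30-39, >=40.
age_bins = [categorize(a, [30, 40]) for a in ages]

auc_continuous = auc(ages, nihl)
auc_binned = auc(age_bins, nihl)

print(f"AUC, continuous age: {auc_continuous:.3f}")  # 0.667
print(f"AUC, binned age:     {auc_binned:.3f}")      # 0.500
```

In this toy case each bin holds one case and one non-case, so the within-bin ordering that the continuous variable captured is lost after binning. Whether a real model loses performance in the same way depends on the data, the cut points, and the algorithm.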
Keywords
Adult
Age Factors
Algorithms
Artificial Intelligence
China
Ear Protective Devices
Female
Hearing
Hearing Loss, Noise-Induced
Humans
Logistic Models
Machine Learning
Male
Middle Aged
Neural Networks, Computer
Noise, Occupational
Occupational Diseases
Occupational Exposure
Predictive Value of Tests
Risk Assessment
Risk Factors
Sex Factors
Time Factors
Young Adult