Machine learning framework for oxytetracycline removal using nanostructured cupric oxide supported on magnetic chitosan alginate biocomposite.
Journal:
Scientific Reports
Published Date:
Jul 18, 2025
Abstract
This research proposes a machine-learning-guided method for removing the antibiotic oxytetracycline (OTC) from liquids using nanostructured cupric oxide (CuO) nanoparticles attached to magnetic chitosan/alginate biocomposites (CuO-M-CAB). The dataset consisted of 42 experimentally generated samples, systematically created by varying four key parameters (X1: antibiotic concentration, X2: pH, X3: reaction time, and X4: adsorbent concentration). Before the machine learning models were applied, the data were preprocessed: Min-Max Scaling normalized all features to the [0, 1] range, outliers were detected and anomalous values removed, correlation analysis was used to avoid redundancy and multicollinearity, and the data were split into training and testing sets at an 80:20 ratio, with K-fold cross-validation (k = 5) for robust model evaluation. The study assesses the predictive accuracy of selected methods, including Tikhonov Regularization, Yandex Boosting, and Particle Swarm Optimization (PSO), to improve removal efficiency and to analyze the factors that influence the process. The Tikhonov model demonstrates high accuracy, with R values of 0.973 for training data and 0.958 for testing data, indicating strong generalization, while maintaining low error rates: RMSE and MAE values are 4.62 and 3.65 for training data and 5.04 and 4.21 for testing data, respectively. Although there are slight indications of overfitting, the differences remain within acceptable limits and do not significantly affect the model's generalization ability. In contrast, the Yandex Boosting model exhibits weaker performance on testing data, with a negative mean residual of -1.16 and a higher residual standard deviation of 4.66, indicating underestimation and increased error.
The substantial discrepancy between training (residual standard deviation of 1.28) and testing (residual standard deviation of 4.66) performance highlights significant overfitting and poorer generalization. Greater scatter in the residual and scatter plots for testing data further underscores reduced predictive accuracy on new data. Consequently, the Tikhonov model outperforms the Yandex Boosting model on testing data, with higher accuracy and lower error rates, whereas Yandex Boosting, despite strong training performance, suffers from overfitting. For the Yandex Boosting model, SHapley Additive exPlanations (SHAP) were used to analyze the impact of each parameter on removal efficiency, revealing that X2 (pH) and X3 (reaction time) are the most influential factors. We also employed Particle Swarm Optimization (PSO) to identify the combination of parameter values that maximizes removal efficiency and to analyze the relationships between the variables. The results of the correlation analysis indicate that X2 and X3 have a strong positive relationship with removal efficiency, while X4 (adsorbent concentration) has a significant negative impact. These findings emphasize the importance of optimizing X2 and X3, which can lead to improved removal efficiency. This research introduces an efficient method for removing oxytetracycline (OTC) from liquids using CuO-M-CAB nanoparticles. By optimizing key parameters such as pH and reaction time through machine learning models (Tikhonov Regularization and PSO), removal efficiency is significantly enhanced.
This method is applicable in water and liquid purification, particularly in pharmaceutical and environmental industries, to combat antibiotic contamination.
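The preprocessing and evaluation steps named in the abstract (Min-Max Scaling, an 80:20 split, Tikhonov Regularization, and the R, RMSE, and MAE metrics) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data are synthetic stand-ins, and the feature ranges, the response formula, the penalty value `lam`, and all function names are assumptions chosen only for illustration. Tikhonov Regularization is implemented here in its common ridge-regression form, solved in closed form.

```python
import math
import random

# Synthetic stand-in data: 42 rows of (X1, X2, X3, X4). The ranges and the
# response formula are invented for illustration, not taken from the study.
random.seed(0)
X = [[random.uniform(5, 50), random.uniform(3, 11),
      random.uniform(10, 120), random.uniform(0.1, 2.0)] for _ in range(42)]
y = [10 + 4 * r[1] + 0.25 * r[2] - 8 * r[3] + random.gauss(0, 2) for r in X]

# 80:20 split on shuffled indices (33 training rows, 9 testing rows).
idx = list(range(len(X)))
random.shuffle(idx)
cut = int(0.8 * len(idx))
X_train, y_train = [X[i] for i in idx[:cut]], [y[i] for i in idx[:cut]]
X_test, y_test = [X[i] for i in idx[cut:]], [y[i] for i in idx[cut:]]


def fit_min_max(rows):
    """Return per-column minimum and span, learned from training rows only."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [(max(c) - min(c)) or 1.0 for c in cols]


def apply_min_max(rows, lows, spans):
    """Scale rows with training statistics; test values may leave [0, 1]."""
    return [[(v - lo) / s for v, lo, s in zip(r, lows, spans)] for r in rows]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= f * m[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def ridge_fit(rows, target, lam):
    """Ridge regression: solve (Xc'Xc + lam I) w = Xc'yc on centered data."""
    n, d = len(rows), len(rows[0])
    x_mean = [sum(c) / n for c in zip(*rows)]
    y_mean = sum(target) / n
    xc = [[v - m for v, m in zip(r, x_mean)] for r in rows]
    yc = [t - y_mean for t in target]
    gram = [[sum(r[i] * r[j] for r in xc) + (lam if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(r[i] * t for r, t in zip(xc, yc)) for i in range(d)]
    w = solve(gram, rhs)
    return w, y_mean - sum(wi * mi for wi, mi in zip(w, x_mean))


def predict(rows, w, b):
    return [b + sum(wi * vi for wi, vi in zip(w, r)) for r in rows]


def scores(y_true, y_pred):
    """RMSE, MAE, and Pearson R between observed and predicted values."""
    n = len(y_true)
    res = [t - p for t, p in zip(y_true, y_pred)]
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return {"rmse": math.sqrt(sum(e * e for e in res) / n),
            "mae": sum(abs(e) for e in res) / n,
            "r": cov / math.sqrt(var_t * var_p)}


lows, spans = fit_min_max(X_train)
Xs_train = apply_min_max(X_train, lows, spans)
Xs_test = apply_min_max(X_test, lows, spans)
w, b = ridge_fit(Xs_train, y_train, lam=0.1)  # lam is an assumed value
train_scores = scores(y_train, predict(Xs_train, w, b))
test_scores = scores(y_test, predict(Xs_test, w, b))
print("train:", train_scores)
print("test: ", test_scores)
```

Comparing `train_scores` with `test_scores` mirrors the training-versus-testing comparison in the abstract: a large gap between the two would suggest overfitting. The scaling statistics are learned from the training rows only, so that no information from the testing rows leaks into the fitted model.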