Classification and regression machine learning models for predicting mixed toxicity of carbamazepine and its transformation products.
Journal:
Environmental research
PMID:
39929412
Abstract
Carbamazepine (CBZ) and its transformation products (TPs) often occur in aquatic environments as mixtures, posing potential risks to ecosystems. However, establishing standardized protocols for synthesizing, isolating, and acquiring these TPs has been challenging, which makes toxicity data difficult to obtain. Accurately assessing the risks associated with the mixed toxicity of TPs is therefore critical. This study evaluated the binary mixed toxicity of CBZ and its TPs using luminescent bacteria. The mixed toxicity of TPs showed simple additive effects. To relate the toxicity of the TPs to that of CBZ, we labeled TPs with toxicity higher than CBZ as 'high risk' and TPs with lower toxicity as 'low risk.' Subsequently, we developed and tested seven classification models and five regression models. The classification models provide guidance for managing toxicity risk, whereas the regression models can address the lack of mixed toxicity data. However, the regression models performed unsatisfactorily on the experimental datasets. Given the difficulty of obtaining toxicity data for TPs, we addressed this limitation by augmenting the dataset with generative adversarial networks (GANs), thereby improving the generalization capability of the regression models. This study highlights the potential of machine learning-based quantitative structure-activity relationship (QSAR) models for predicting the mixed toxicity of TPs, providing a solution for toxicity assessment without chemical standards.
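The labeling rule described in the abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that toxicity is expressed as an EC50 value, so that a lower EC50 means higher toxicity; the abstract does not state the toxicity metric, and the function name `label_risk` is illustrative only. The abstract also does not say how a TP with toxicity equal to that of CBZ is handled, so treating it as 'low risk' here is an assumption.

```python
def label_risk(tp_ec50: float, cbz_ec50: float) -> str:
    """Label a transformation product (TP) relative to CBZ.

    Assumption: toxicity is given as EC50, so a LOWER value means
    HIGHER toxicity. The metric is not specified in the abstract.
    """
    if tp_ec50 <= 0 or cbz_ec50 <= 0:
        raise ValueError("EC50 values must be positive")
    # More toxic than CBZ (lower EC50) -> 'high risk'; otherwise 'low risk'.
    # Equal toxicity falls into 'low risk' here, which is an assumption.
    return "high risk" if tp_ec50 < cbz_ec50 else "low risk"
```

Labels produced this way would serve as the binary target for classification models, while the measured toxicity values themselves would be the target for regression models.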