Integrating Regression and Boosting Techniques for Enhanced River Water Quality Monitoring in the Cauvery Basin: A Seasonal and Sustainable Approach.
Journal:
Water environment research : a research publication of the Water Environment Federation
Published Date:
Jul 1, 2025
Abstract
This study addresses a critical research gap in water quality monitoring within the Cauvery River basin, where substantial contamination poses significant risks to both human health and aquatic ecosystems. The paper introduces an effective and sustainable river water quality monitoring system, termed MLRMC-WQM (Multiple Linear Regression and Multi-class CatBoost-based Water Quality Monitoring). The system uses Multiple Linear Regression to predict basic water quality parameters from straightforward linear relationships, while CatBoost refines these predictions by capturing more complex, nonlinear relationships. Various sensors are integrated with a Raspberry Pi-5, which collects readings at regular intervals. The Raspberry Pi-5 is equipped with wireless communication modules that transmit real-time data to cloud servers, where the information is stored and processed. Cloud platforms provide scalability, security, and accessibility for efficient data management. By incorporating energy-efficient and scalable technologies, the system minimizes environmental impact while ensuring long-term sustainability. If the system detects abnormal levels of pollutants, turbidity, or other parameters, it triggers automated alerts via SMS, email, or app notifications. The MLRMC-WQM model is evaluated with regression metrics, namely Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), R-squared (R²), and Mean Squared Error (MSE), which measure the accuracy of parameter predictions, and with classification metrics, namely accuracy, precision, and F1-score, which measure the quality of water quality categorization. A comparative analysis with three state-of-the-art methods demonstrates that the MLRMC-WQM model achieves a validation accuracy of 97.92%, outperforming the other methods. This study contributes a practical, technology-driven tool that bridges environmental science and decision-making.
By enabling real-time, multi-faceted monitoring and promoting data-driven and timely interventions, the system supports sustainable water resource management, significantly enhancing efforts to conserve vital water resources and protect ecosystems.
SUMMARY:
A hybrid methodology has been proposed for effective river water quality monitoring.
Real-time data collection has been conducted across multiple locations.
Diverse water quality parameters have been measured and analyzed.
Two distinct seasons have been analyzed to monitor water quality.
The performance of MLRMC-WQM has been evaluated and compared with other machine learning techniques.
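The regression metrics named in the abstract (MAE, MSE, RMSE, and R²) follow standard definitions, and a minimal Python sketch of how they could be computed is given here for illustration. The function name `regression_metrics` and the sample pH values are hypothetical and are not taken from the study, which does not describe its implementation at this level of detail.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for paired observations and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root mean squared error

    # R-squared: 1 minus the ratio of residual to total sum of squares.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical pH readings (illustrative values only, not data from the study).
observed = [7.0, 7.5, 8.0, 6.5]
predicted = [7.5, 7.5, 7.5, 6.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

RMSE is the square root of MSE, so the two always rank models identically; RMSE is reported in the same units as the measured parameter, which can make it easier to interpret.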