River Water Salinity Prediction Using Hybrid Machine Learning Models




Melesse, Assefa M.
Khosravi, Khabat
Tiefenbacher, John
Heddam, Salim
Kim, Sungwon
Mosavi, Amir
Pham, Binh Thai


Multidisciplinary Digital Publishing Institute


Electrical conductivity (EC), one of the most widely used indices for water quality assessment, has been applied to predict the salinity of the Babol-Rood River, the greatest source of irrigation water in northern Iran. This study uses two individual algorithms, M5 Prime (M5P) and random forest (RF), and eight novel hybrid algorithms (bagging-M5P, bagging-RF, random subspace (RS)-M5P, RS-RF, random committee (RC)-M5P, RC-RF, additive regression (AR)-M5P, and AR-RF) to predict EC. Thirty-six years of observations collected by the Mazandaran Regional Water Authority were randomly divided into two sets: 70% from the period 1980 to 2008 was used as model-training data and 30% from 2009 to 2016 was used as testing data to validate the models. Several water quality variables (pH, HCO₃⁻, Cl⁻, SO₄²⁻, Na⁺, Mg²⁺, Ca²⁺, river discharge (Q), and total dissolved solids (TDS)) were modeling inputs. Nine input combinations were established from the correlation coefficients (CC) between EC and the water quality variables. TDS had the highest correlation with EC (r = 0.91) and was also determined to be the most important input variable among the input combinations. All models were trained, and each model's predictive power was evaluated with the testing data. Several quantitative criteria and visual comparisons were used to evaluate modeling capabilities. Results indicate that, in most cases, hybrid algorithms enhance the predictive power of the individual algorithms. The AR algorithm improved both M5P and RF predictions more than bagging, RS, and RC did. M5P performed better than RF. Further, AR-M5P outperformed all other algorithms (R² = 0.995, RMSE = 8.90 μS/cm, MAE = 6.20 μS/cm, NSE = 0.994, and PBIAS = -0.042). The hybridization of machine learning methods significantly improved the models' ability to capture maximum salinity values, which is essential in water resource management.
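The quantitative criteria named in the abstract (R², RMSE, MAE, NSE, and PBIAS) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names and the sample EC values are hypothetical, R² is taken here as the squared Pearson correlation, and PBIAS follows the common convention 100 × Σ(observed − simulated) / Σ(observed); the paper may define these slightly differently.

```python
import math

def rmse(obs, sim):
    """Root mean square error, in the units of the data (e.g. uS/cm)."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error, in the units of the data."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance

def pbias(obs, sim):
    """Percent bias (assumed convention: positive means underestimation)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

def r_squared(obs, sim):
    """Coefficient of determination, assumed here to be squared Pearson r."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)

# Hypothetical observed and predicted EC values (uS/cm), for illustration only.
observed = [400.0, 500.0, 600.0, 700.0]
predicted = [410.0, 490.0, 610.0, 690.0]

for name, fn in [("R2", r_squared), ("RMSE", rmse), ("MAE", mae),
                 ("NSE", nse), ("PBIAS", pbias)]:
    print(f"{name}: {fn(observed, predicted):.3f}")
```

RMSE and MAE carry the units of EC, while R², NSE, and PBIAS are dimensionless, which is why only the first two are reported in μS/cm.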



water salinity, machine learning, bagging, random forest, random subspace, data science, hydrological model, hydroinformatics, electrical conductivity, Geography and Environmental Studies


Melesse, A. M., Khosravi, K., Tiefenbacher, J. P., Heddam, S., Kim, S., Mosavi, A., & Pham, B. T. (2020). River Water Salinity Prediction Using Hybrid Machine Learning Models. Water, 12(10), 2951.


Rights Holder

© 2020 The Authors.

Rights License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Rights URI