Moreido, Vsevolod; Gartsman, Boris; Solomatine, Dimitri P.; Suchilina, Zoya. 2021. How Well Can Machine Learning Models Perform without Hydrologists? Application of Rational Feature Selection to Improve Hydrological Forecasting. Water 13, no. 12: 1696.
https://doi.org/10.3390/w13121696
With more machine learning methods being involved in social and environmental research activities, we are addressing the role of available information for model training in model performance. We tested the abilities of several machine learning models for short-term hydrological forecasting by inferring linkages with all available predictors or only with those pre-selected by a hydrologist. The models used in this study were multivariate linear regression, the M5 model tree, multilayer perceptron (MLP) artificial neural network, and the long short-term memory (LSTM) model. We used two river catchments in contrasting runoff generation conditions to try to infer the ability of different model structures to automatically select the best predictor set from all those available in the dataset and compared models’ performance with that of a model operating on predictors prescribed by a hydrologist. Additionally, we tested how shuffling of the initial dataset improved model performance. We can conclude that in rainfall-driven catchments, the models performed generally better on a dataset prescribed by a hydrologist, while in mixed-snowmelt and baseflow-driven catchments, the automatic selection of predictors was preferable.
water-13-01696-v2.pdf
With more machine learning methods being involved in social and environmental research activities, we are addressing the role of available information for model training in model performance. We tested the abilities of several machine learning models for short-term hydrological forecasting by inferring linkages with all available predictors or only with those pre-selected by a hydrologist. The models used in this study were multivariate linear regression, the M5 model tree, multilayer perceptron (MLP) artificial neural network, and the long short-term memory (LSTM) model. We used two river catchments in contrasting runoff generation conditions to try to infer the ability of different model structures to automatically select the best predictor set from all those available in the dataset and compared models’ performance with that of a model operating on predictors prescribed by a hydrologist. Additionally, we tested how shuffling of the initial dataset improved model performance. We can conclude that in rainfall-driven catchments, the models performed generally better on a dataset prescribed by a hydrologist, while in mixed-snowmelt and baseflow-driven catchments, the automatic selection of predictors was preferable.
water-13-01696-v2.pdf
Дата публикации: 2021