Selection of relevant input variables in stormwater quality modelling by multi-objective evolutionary polynomial regression paradigm
Water Resources Research
The growing availability of field data from information and communication technologies (ICTs) in “smart” urban infrastructures increasingly allows data-modelling to be used to understand complex phenomena and to support management decisions. Among the phenomena analysed, those related to stormwater quality have recently been gaining interest in the scientific literature. Nonetheless, the large amount of available data poses the problem of selecting the variables that are relevant for describing a phenomenon and enabling robust data-modelling. This paper presents a procedure for selecting relevant input variables using the multi-objective evolutionary polynomial regression (EPR-MOGA) paradigm. The procedure is based on scrutinizing the explanatory variables that appear in the set of EPR-MOGA symbolic model expressions, which are of increasing complexity and goodness of fit to the target output. The strategy also enables the selection to be validated by engineering judgement. In this context, the multiple case study extension of EPR-MOGA, called MCS-EPR-MOGA, is adopted. Application of the proposed procedure to modelling stormwater quality parameters in two French catchments shows that it significantly reduced the number of explanatory variables for subsequent analyses. Finally, the EPR-MOGA models obtained after the input selection are compared with those obtained using the same technique without input selection, and with those obtained in previous works where other data-modelling techniques were applied to the same data. The comparison highlights the effectiveness of both EPR-MOGA and the input selection procedure.
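As a rough illustration of scrutinizing which explanatory variables appear across a set of symbolic models, the Python sketch below counts how often each candidate input occurs with a non-zero exponent in a small set of models and keeps the inputs that recur. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the variable names, the model set, the dictionary representation of model terms, and the `min_frequency` threshold are all hypothetical, and the paper pairs the selection with engineering judgement rather than a fixed cut-off.

```python
from collections import Counter

# Hypothetical candidate inputs (not taken from the paper's data sets).
CANDIDATES = ["rain_depth", "max_intensity", "dry_period", "duration", "prev_rain"]

# Hypothetical set of models of increasing complexity. Each model is a list of
# terms, and each term maps a variable name to its exponent in that term.
MODELS = [
    [{"rain_depth": 1.0}],
    [{"rain_depth": 1.0, "max_intensity": 0.5}],
    [{"rain_depth": 1.0, "max_intensity": 0.5}, {"dry_period": 0.5}],
    [{"rain_depth": 0.5, "max_intensity": 1.0}, {"dry_period": 0.5, "duration": 0.0}],
]


def variables_in_model(model):
    """Return the variables that appear with a non-zero exponent in a model."""
    return {name for term in model for name, exponent in term.items() if exponent != 0}


def selection_frequency(models, candidates):
    """Fraction of models in which each candidate variable appears."""
    counts = Counter()
    for model in models:
        counts.update(variables_in_model(model))
    return {name: counts[name] / len(models) for name in candidates}


def select_inputs(models, candidates, min_frequency=0.5):
    """Keep candidates present in at least min_frequency of the models (assumed rule)."""
    frequency = selection_frequency(models, candidates)
    return [name for name in candidates if frequency[name] >= min_frequency]


if __name__ == "__main__":
    for name, value in selection_frequency(MODELS, CANDIDATES).items():
        print(f"{name}: {value:.2f}")
    print("selected:", select_inputs(MODELS, CANDIDATES))
```

A variable with a zero exponent contributes nothing to a term, so the sketch treats it as absent from that model. In practice the resulting shortlist would still be reviewed against physical understanding of the catchments before being used for further modelling.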
This is the author accepted manuscript. The final version is available from Wiley via the DOI in this record.
Published online 4 March 2016