The performance of an ensemble forecast, as measured by scoring rules, depends on its number of members. Under the assumption of ensemble member exchangeability, ensemble-adjusted scores provide unbiased estimates of the ensemble-size effect. In this study, the concept of ensemble-adjusted scores is revisited and exploited in the general context of multi-model ensemble forecasting. In particular, an ensemble-size adjustment is proposed for the continuous ranked probability score in a multi-model ensemble setting. The method requires that the ensemble forecasts satisfy generalized multi-model exchangeability
conditions. These conditions do not require the models themselves to be exchangeable.
The adjusted scores are tested here on a dual-resolution ensemble, i.e., an ensemble that combines
members drawn from the same numerical model but run at two different grid resolutions. It is shown
that performance of different ensemble combinations can be robustly estimated based on a small subset
of members from each model. At no additional cost, the ensemble-size effect is investigated not only by pooling potential extra members but also by accounting for the impact of optimal weighting
strategies. With simple and efficient tools, the proposed methodology paves the way for predictive
verification of multi-model ensemble forecasts; the derived statistics can provide guidance for the
design of future operational ensemble configurations without having to run additional ensemble forecast
experiments for all the potential configurations.
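To make the underlying idea concrete, the sketch below implements the standard single-model ensemble-size adjustment of the CRPS (the "fair" CRPS of Ferro and co-authors), which is the building block the abstract refers to; the multi-model generalization proposed in the paper is not reproduced here. The function name and interface are illustrative, and the unbiasedness relies on the stated member-exchangeability assumption.

```python
import numpy as np

def fair_crps(members, obs):
    """Ensemble-size-adjusted (fair) CRPS for a single forecast case.

    Standard single-model form (illustrative, not the paper's
    multi-model generalization):
        CRPS_fair = (1/M) * sum_i |x_i - y|
                    - sum_{i,j} |x_i - x_j| / (2 * M * (M - 1))
    Under member exchangeability, this is an unbiased estimate of the
    CRPS of a hypothetical infinite-member ensemble.
    """
    x = np.asarray(members, dtype=float)
    m = x.size
    if m < 2:
        raise ValueError("fair CRPS needs at least two members")
    # Mean absolute error of members against the observation
    term_obs = np.mean(np.abs(x - obs))
    # Mean absolute pairwise member spread, with the (M - 1)
    # denominator that removes the finite-ensemble bias
    term_spread = np.sum(np.abs(x[:, None] - x[None, :])) / (2.0 * m * (m - 1))
    return term_obs - term_spread

def raw_crps(members, obs):
    """Unadjusted ensemble CRPS (kernel form), for comparison."""
    x = np.asarray(members, dtype=float)
    m = x.size
    return (np.mean(np.abs(x - obs))
            - np.sum(np.abs(x[:, None] - x[None, :])) / (2.0 * m * m))
```

For a two-member ensemble `[0.0, 1.0]` verifying against `0.5`, the raw CRPS is 0.25 while the fair CRPS is 0.0, illustrating how the adjustment removes the penalty that a small ensemble would otherwise incur purely because of its size.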