Posted on 2025-08-02, 13:02. Authored by JM Salter, TJ McKinley, X Xiong, DB Williamson
Computer models are used to study the real world. They often contain a large number of uncertain input parameters, produce a large number of outputs, may be expensive to run, and need calibrating to real-world observations in order to be useful for decision-making. Emulators are often used as cheap surrogates for the expensive simulator: trained on a small number of simulations, they provide predictions with uncertainty at unseen inputs. In epidemiological applications, for example compartmental or agent-based models of the spread of infectious diseases, the output is usually spatially and temporally indexed, stochastic, and consists of counts rather than continuous variables. Here, we consider emulating high-dimensional count output from a complex computer model using a Poisson Lognormal PCA (PLNPCA) emulator. We apply the PLNPCA emulator to output fields from a Covid-19 model for England and Wales and compare it with emulators fitted to aggregations of the full output. We show that performance is generally comparable, whilst the PLNPCA emulator has additional desirable properties: it allows the full output to be predicted whilst capturing correlations between outputs, and it provides high-dimensional samples of counts that are representative of the true model output.
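As a rough illustration of the kind of count model the abstract refers to, the sketch below simulates from a generic Poisson lognormal PCA model: a low-dimensional Gaussian latent layer is mapped through a loading matrix to log-intensities, and counts are drawn as Poisson given those intensities. It is not taken from the authors' code. The function name `sample_pln_pca`, the dimensions, and the parameter values are hypothetical, and the sketch omits the dependence on simulator inputs that an emulator would add.

```python
import numpy as np

def sample_pln_pca(mu, C, n_samples, rng):
    """Draw counts from a generic Poisson lognormal PCA model (illustrative only).

    mu : (p,) mean of the latent log-intensity (assumed constant here)
    C  : (p, q) loading matrix with q much smaller than p
    """
    p, q = C.shape
    W = rng.standard_normal((n_samples, q))  # low-dimensional latent scores
    Z = mu + W @ C.T                         # Gaussian log-intensities, covariance C C^T
    return rng.poisson(np.exp(Z))            # counts, independent given Z

# Hypothetical sizes: 50 output locations, 3 latent components.
rng = np.random.default_rng(0)
p, q = 50, 3
mu = np.full(p, 2.0)
C = 0.3 * rng.standard_normal((p, q))
Y = sample_pln_pca(mu, C, n_samples=200, rng=rng)
```

Because all outputs share the same few latent scores, the sampled counts are correlated across the `p` outputs while each draw remains a full vector of non-negative integers.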
Funding
EP/V051555/1
Engineering and Physical Sciences Research Council (EPSRC)
This is the final version. Available on open access from the Royal Society via the DOI in this record.
Data Accessibility. Data and code are available at https://github.com/JSalter90/CountBasis