Benchmarking the Performance of Homogenisation Algorithms on Daily Temperature Data
Killick, Rachel Elizabeth
Thesis or dissertation
University of Exeter
Reliable temperature time series are necessary to quantify how our world is changing. Unfortunately, many non-climatic artefacts, known as inhomogeneities, affect these time series. In real-world data it is often not possible to distinguish between these non-climatic artefacts and true climatic variations. Removing the non-climatic artefacts with complete confidence is therefore problematic, but leaving them in could lead to misinterpretation of climate variations. By creating realistic, homogeneous, synthetic daily temperature series, the truth about the data can be known completely. Known, created inhomogeneity structures can be added to these series, making it possible to distinguish true climatic signals from artificial artefacts. Applying homogenisation algorithms to these created inhomogeneous data allows algorithm performance to be assessed, because the returned contributions are compared with a known standard or benchmark, the clean data. In this work a Generalised Additive Model (GAM) was used to create synthetic, clean, daily temperature series. Daily data pose new challenges compared to monthly or annual data owing to their increased variability and quantity. This is the first intercomparison study to assess homogenisation algorithm performance on temperature data at the daily level. The inhomogeneity structures added to the clean data were created by perturbing the inputs to the GAM, which created seasonally varying inhomogeneities, and by adding constant offsets, which created constant inhomogeneities. Four climatically diverse regions in the United States were modelled, which allowed the impact of climatic diversity on homogenisation algorithm performance to be explored. Four different data scenarios, incorporating three different inhomogeneity structures, were used, and evaluations also investigated how these impacted algorithm performance.
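As a rough illustration of the two kinds of inhomogeneity described above, the following Python sketch builds a toy clean daily series and corrupts it before a chosen change point. Everything in it is an assumption made for illustration: the clean series is a sinusoidal annual cycle with AR(1) noise rather than the GAM used in the thesis, the seasonally varying inhomogeneity is added directly rather than by perturbing model inputs, and all function names, offsets and dates are hypothetical.

```python
import math
import random


def make_clean_series(n_days, seed=0):
    """Toy stand-in for a clean series: annual cycle plus AR(1) noise (not a GAM)."""
    rng = random.Random(seed)
    series, noise = [], 0.0
    for day in range(n_days):
        seasonal = 10.0 + 12.0 * math.sin(2.0 * math.pi * (day - 110) / 365.25)
        noise = 0.7 * noise + rng.gauss(0.0, 2.0)  # autocorrelated daily noise
        series.append(seasonal + noise)
    return series


def add_constant_offset(series, change_point, offset):
    """Shift every value before the change point by the same fixed amount."""
    return [v + offset if i < change_point else v for i, v in enumerate(series)]


def add_seasonal_offset(series, change_point, amplitude):
    """Shift values before the change point by an amount that varies through the year."""
    out = []
    for i, v in enumerate(series):
        if i < change_point:
            v += amplitude * math.cos(2.0 * math.pi * i / 365.25)
        out.append(v)
    return out


# Hypothetical three-year series with one change point on day 500.
clean = make_clean_series(3 * 365)
released_constant = add_constant_offset(clean, change_point=500, offset=-0.8)
released_seasonal = add_seasonal_offset(clean, change_point=500, amplitude=0.8)
```

Because the clean series is kept alongside each corrupted version, any later adjustment can be checked against a known truth, which is the point of a benchmark.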
Eight homogenisation algorithms were contributed to this study, and their performance was assessed according to both their ability to detect change points and their ability to return series that were closer to the clean data than they were on release. These evaluations sought to aid the improvement of these algorithms and to enable a quantification of the uncertainty remaining in daily temperature data even after homogenisation has taken place. The benchmarks themselves were also evaluated, as it was important that benchmark weaknesses were taken into account. It was found that more climatologically diverse regions were harder to model and less climatologically diverse regions were easier to homogenise. Station density in a network and the presence of artificial trend inhomogeneities did not impact algorithm performance as much as changes in autocorrelations did, and handling the latter was an area in which most algorithms could improve. This work feeds into the larger project of the International Surface Temperature Initiative, which is working on a wider scale and with monthly instead of daily data.
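The two assessment criteria named above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: root-mean-square error is used as the closeness measure, a detected change point counts as a hit if it lies within a 30-day tolerance of a true one, and the function names and example values are hypothetical. The thesis may use different metrics and matching rules.

```python
import math


def rmse(a, b):
    """Root-mean-square difference between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def improved(clean, released, homogenised):
    """True if the homogenised series is closer to the clean data than the released one."""
    return rmse(homogenised, clean) < rmse(released, clean)


def detection_scores(true_cps, detected_cps, tolerance_days=30):
    """Count hits, misses and false alarms; each detection matches at most one true change point."""
    unmatched = sorted(detected_cps)
    hits = 0
    for cp in sorted(true_cps):
        match = next((d for d in unmatched if abs(d - cp) <= tolerance_days), None)
        if match is not None:
            hits += 1
            unmatched.remove(match)
    return {
        "hits": hits,
        "misses": len(true_cps) - hits,
        "false_alarms": len(unmatched),
    }


# Hypothetical four-day example: the released data are 1 degree too cold on the first two days.
clean = [10.0, 11.0, 12.0, 13.0]
released = [9.0, 10.0, 12.0, 13.0]
homogenised = [9.8, 10.9, 12.0, 13.0]
closer = improved(clean, released, homogenised)

# Hypothetical change points given as day indices.
scores = detection_scores(true_cps=[500, 900], detected_cps=[510, 1200])
```

Keeping the two scores separate reflects the distinction drawn in the text: an algorithm can locate change points well yet adjust the series poorly, or the reverse.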
National Meteorological Office, UK - CASE funding
Engineering and Physical Sciences Research Council
PhD in Mathematics