Robust spatio-temporal latent variable models
Thesis or dissertation
University of Exeter
Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) are widely-used mathematical models for decomposing multivariate data. They capture spatial relationships between variables, but ignore any temporal relationships that might exist between observations. Probabilistic PCA (PPCA) and Probabilistic CCA (ProbCCA) are versions of these two models that explain the statistical properties of the observed variables as linear mixtures of an alternative, hypothetical set of hidden, or latent, variables and explicitly model noise. Both the noise and the latent variables are assumed to be Gaussian distributed. This thesis introduces two new models, named PPCA-AR and ProbCCA-AR, that augment PPCA and ProbCCA respectively with autoregressive processes over the latent variables to additionally capture temporal relationships between the observations. To make PPCA-AR and ProbCCA-AR robust to outliers and able to model leptokurtic data, the Gaussian assumptions are replaced with infinite scale mixtures of Gaussians, using the Student-t distribution. Bayesian inference calculates posterior probability distributions for each of the parameter variables, from which we obtain a measure of confidence in the inference. It avoids the pitfalls associated with the maximum likelihood method: integrating over all possible values of the parameter variables guards against overfitting. For these new models the integrals required for exact Bayesian inference are intractable; instead a method of approximation, the variational Bayesian approach, is used. This enables the use of automatic relevance determination to estimate the model orders. PPCA-AR and ProbCCA-AR can be viewed as linear dynamical systems, so the forward-backward algorithm, also known as the Baum-Welch algorithm, is used as an efficient method for inferring the posterior distributions of the latent variables. The exact algorithm is tractable because Gaussian assumptions are made regarding the distribution of the latent variables. This thesis introduces a variational Bayesian forward-backward algorithm based on Student-t assumptions. The new models are demonstrated on synthetic datasets and on real remote sensing and EEG data.
Christmas, J. and Everson, R. (2010). Temporally coupled Principal Component Analysis: a probabilistic autoregression method. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Barcelona, 19th-23rd July 2010.
Christmas, J. and Everson, R. (2011). Robust autoregression: Student-t innovations using variational Bayes. IEEE Transactions on Signal Processing, 59(1):48-57.
PhD in Computer Science