Multi-Objective ROC learning for classification
Clark, Andrew Robert James
Thesis or dissertation
University of Exeter
Receiver operating characteristic (ROC) curves are widely used for evaluating classifier performance, having been applied to e.g. signal detection, medical diagnostics and safety critical systems. They allow examination of the trade-offs between true and false positive rates as misclassification costs are varied. Examination of the resulting graphs and calcu- lation of the area under the ROC curve (AUC) allows assessment of how well a classifier is able to separate two classes and allows selection of an operating point with full knowledge of the available trade-offs. In this thesis a multi-objective evolutionary algorithm (MOEA) is used to find clas- sifiers whose ROC graph locations are Pareto optimal. The Relevance Vector Machine (RVM) is a state-of-the-art classifier that produces sparse Bayesian models, but is unfor- tunately prone to overfitting. Using the MOEA, hyper-parameters for RVM classifiers are set, optimising them not only in terms of true and false positive rates but also a novel measure of RVM complexity, thus encouraging sparseness, and producing approximations to the Pareto front. Several methods for regularising the RVM during the MOEA train- ing process are examined and their performance evaluated on a number of benchmark datasets demonstrating they possess the capability to avoid overfitting whilst producing performance equivalent to that of the maximum likelihood trained RVM. A common task in bioinformatics is to identify genes associated with various genetic conditions by finding those genes useful for classifying a condition against a baseline. Typ- ically, datasets contain large numbers of gene expressions measured in relatively few sub- jects. As a result of the high dimensionality and sparsity of examples, it can be very easy to find classifiers with near perfect training accuracies but which have poor generalisation capability. Additionally, depending on the condition and treatment involved, evaluation over a range of costs will often be desirable. An MOEA is used to identify genes for clas- sification by simultaneously maximising the area under the ROC curve whilst minimising model complexity. This method is illustrated on a number of well-studied datasets and ap- plied to a recent bioinformatics database resulting from the current InChianti population study. Many classifiers produce “hard”, non-probabilistic classifications and are trained to find a single set of parameters, whose values are inevitably uncertain due to limited available training data. In a Bayesian framework it is possible to ameliorate the effects of this parameter uncertainty by averaging over classifiers weighted by their posterior probabil- ity. Unfortunately, the required posterior probability is not readily computed for hard classifiers. In this thesis an Approximate Bayesian Computation Markov Chain Monte Carlo algorithm is used to sample model parameters for a hard classifier using the AUC as a measure of performance. The ability to produce ROC curves close to the Bayes op- timal ROC curve is demonstrated on a synthetic dataset. Due to the large numbers of sampled parametrisations, averaging over them when rapid classification is needed may be impractical and thus methods for producing sparse weightings are investigated.
Clark, A. and Everson, R. (2011). Evolving sparse multi-resolution RVM classifiers. In Dupenois, M. and Walker, D., editors, Proceedings of the 2nd Postgraduate Confer- ence for Computing: Applications and Theory (PCCAT 2011), pages 53 – 60, Exeter, UK. PCCAT, College of Engineering, Mathematics and Physical Sciences, University of Exeter.
Clark, A. and Everson, R. (2011). Multi-objective learning of relevance vector machine classifiers with multi-resolution kernels. Pattern Recognition, available online 7 March 2012 (http://www.sciencedirect.com/science/article/pii/S0031320312001033)
PhD in Computer Science