Gene ranking by bootstrapped P-values
Mukherjee, SN; Sykacek, P; Roberts, SJ; et al.Gurr, SJ
Date: 1 December 2003
Journal
SIGKDD Explorations : newsletter of the Special Interest Group (SIG) on Knowledge Discovery & Data Mining
Publisher
SIGKDD
Abstract
Recent research has shown that it is possible to find genes involved in the pathogenesis of a particular condition on the basis of microarray experiments. Genes which are differentially expressed, for example between healthy and diseased tissues, are likely to be relevant to the disease under study. Some of the properties of microarray ...
Recent research has shown that it is possible to find genes involved in the pathogenesis of a particular condition on the basis of microarray experiments. Genes which are differentially expressed, for example between healthy and diseased tissues, are likely to be relevant to the disease under study. Some of the properties of microarray datasets make the task
of finding these genes a challenging one. This paper proposes a gene-ranking algorithm whose main novelty is the use of bootstrapped P-values. We present an analysis of the algorithm, showing how it takes account of small-sample variability in observed values of the test statistic, in a way conventional statistical tests cannot. Experimental results show that our algorithm outperforms the widely-used two-sample t-test on challenging artificial data. Gene ranking is then performed on two well-known microarray datasets, with encouraging results. For example, a number of genes from one of the datasets, whose differential expression was subsequently confirmed by a more reliable biochemical analysis, are found to be ranked higher by the bootstrapped algorithm than by the conventional t-test, suggesting that the proposed algorithm may be better able to exploit the limited data available to infer biologically useful information.
Biosciences - old structure
Collections of Former Colleges
Item views 0
Full item downloads 0