Mining the human phenome using allelic scores that index biological intermediates
St Pourcain, B
Public Library of Science
Copyright: © 2013 Evans et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
It is common practice in genome-wide association studies (GWAS) to focus on the relationship between disease risk and genetic variants one marker at a time. When relevant genes are identified, it is often possible to implicate biological intermediates and pathways likely to be involved in disease aetiology. However, single genetic variants typically explain small amounts of disease risk. Our idea is to construct allelic scores that explain greater proportions of the variance in biological intermediates, and subsequently use these scores to data mine GWAS. To investigate the approach's properties, we indexed three biological intermediates where the results of large GWAS meta-analyses were available: body mass index, C-reactive protein and low-density lipoprotein levels. We generated allelic scores in the Avon Longitudinal Study of Parents and Children, and in publicly available data from the first Wellcome Trust Case Control Consortium. We compared the explanatory ability of allelic scores in terms of their capacity to proxy for the intermediate of interest, and the extent to which they associated with disease. We found that allelic scores derived from known variants and allelic scores derived from hundreds of thousands of genetic markers explained significant portions of the variance in biological intermediates of interest, and many of these scores showed expected correlations with disease. Genome-wide allelic scores, however, tended to lack specificity, suggesting that they should be used with caution and perhaps only to proxy biological intermediates for which there are no known individual variants. Power calculations confirm the feasibility of extending our strategy to the analysis of tens of thousands of molecular phenotypes in large genome-wide meta-analyses.
We conclude that our method represents a simple way in which potentially tens of thousands of molecular phenotypes could be screened for causal relationships with disease without having to expensively measure these variables in individual disease collections.
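As a rough illustration of the idea summarised above, an allelic score can be computed as a weighted sum of each individual's allele counts across a set of variants, and its usefulness as a proxy judged by the proportion of variance it explains in the intermediate. The sketch below is not the authors' pipeline: the function names `allelic_score` and `variance_explained`, the simulated genotypes, and the effect sizes are all assumptions made for illustration, and a real analysis would take weights from an independent GWAS meta-analysis.

```python
import numpy as np


def allelic_score(genotypes, weights=None):
    """Return one score per individual: the weighted sum of allele counts.

    genotypes: array of shape (n_individuals, n_variants) holding 0/1/2 counts
               (or imputed dosages) of the trait-increasing allele.
    weights:   per-variant effect sizes; if omitted, every variant counts
               equally (an unweighted score).
    """
    genotypes = np.asarray(genotypes, dtype=float)
    if weights is None:
        weights = np.ones(genotypes.shape[1])
    weights = np.asarray(weights, dtype=float)
    return genotypes @ weights


def variance_explained(score, phenotype):
    """Squared Pearson correlation between the score and a phenotype."""
    r = np.corrcoef(score, phenotype)[0, 1]
    return r ** 2


# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n_people, n_snps = 2000, 50
freqs = rng.uniform(0.1, 0.9, n_snps)             # assumed allele frequencies
genotypes = rng.binomial(2, freqs, size=(n_people, n_snps))
betas = rng.normal(0.0, 0.05, n_snps)             # assumed per-allele effects
intermediate = genotypes @ betas + rng.normal(0.0, 1.0, n_people)

score = allelic_score(genotypes, betas)
r2 = variance_explained(score, intermediate)
print(f"Variance in the intermediate explained by the score: {r2:.3f}")
```

In this framing, the same score would then be tested for association with disease status in a separate collection where the intermediate itself was never measured.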
The UK Medical Research Council (grant 74882), the Wellcome Trust (grant 076467) and the University of Bristol provide core support for ALSPAC. We thank 23andMe for funding the genotyping of the ALSPAC children's sample. Funding to pay the Open Access publication charges for this article was provided by the Wellcome Trust. MJAB was funded by the Wellcome Trust (grant 085515) and Leducq Foundation. JPK is funded by a Wellcome Trust grant (WT083431MA). AD is supported by NWO grant (veni, 916.12.154) and the EUR Fellowship. The QIMR Twin-Family Studies were supported by NIH grants (AA07535, AA07728, AA13320, AA13321, AA13326, AA14041, AA11998, AA17688, DA012854, DA019951); by grants from the Australian National Health and Medical Research Council (241944, 339462, 389927, 389875, 389891, 389892, 389938, 442915, 442981, 496739, 552485, 552498); from the Australian Research Council (A7960034, A79906588, A79801419, DP0770096, DP0212016, DP0343921); and the FP-5 GenomEUtwin Project (QLG2-CT-2002-01254). Genotyping was partially supported by grant AA13320 to the late Richard Todd, PhD, MD. GWM is supported by the National Health and Medical Research Council (NHMRC) Fellowship Scheme. The research leading to this work has received funding from the EU 7th Framework Programme under grant agreement number 247642, GEoCoDE. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
This is the final version of the article. Available from Public Library of Science via the DOI in this record.
Vol. 9 (10), article e1003919