Powerful SNP-Set Analysis for Case-Control Genome-wide Association Studies

Michael C. Wu; Peter Kraft; Michael P. Epstein; Deanne M. Taylor; Stephen J. Chanock; David J. Hunter; Xihong Lin

doi:10.1016/j.ajhg.2010.05.002

Research output: Contribution to journal › Article › peer-review

392 Scopus citations

Abstract

GWAS have emerged as popular tools for identifying genetic variants that are associated with disease risk. Standard analysis of a case-control GWAS involves assessing the association between each individual genotyped SNP and disease risk. However, this approach suffers from limited reproducibility and difficulties in detecting multi-SNP and epistatic effects. As an alternative analytical strategy, we propose grouping SNPs together into SNP sets on the basis of proximity to genomic features such as genes or haplotype blocks, then testing the joint effect of each SNP set. Testing of each SNP set proceeds via the logistic kernel-machine-based test, which is based on a statistical framework that allows for flexible modeling of epistatic and nonlinear SNP effects. This flexibility and the ability to naturally adjust for covariate effects are important features of our test that make it appealing in comparison to individual SNP tests and existing multimarker tests. Using simulated data based on the International HapMap Project, we show that SNP-set testing can have improved power over standard individual-SNP analysis under a wide range of settings. In particular, we find that our approach has higher power than individual-SNP analysis when the median correlation between the disease-susceptibility variant and the genotyped SNPs is moderate to high. When the correlation is low, both individual-SNP analysis and the SNP-set analysis tend to have low power. We apply SNP-set analysis to analyze the Cancer Genetic Markers of Susceptibility (CGEMS) breast cancer GWAS discovery-phase data.
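
The abstract describes testing the joint effect of a SNP set with a kernel-machine score test. The sketch below illustrates the general idea under several stated assumptions and is not the authors' implementation: it uses a linear kernel, an intercept-only null model (no covariate adjustment), and a permutation p-value swapped in for whatever null distribution the paper uses, which the abstract does not specify. The function names `linear_kernel`, `score_statistic`, and `snp_set_pvalue` are hypothetical, and the genotypes in the demo are randomly simulated.

```python
import numpy as np


def linear_kernel(G):
    """Linear kernel K = G G^T for an (n_subjects, n_snps) genotype matrix."""
    G = np.asarray(G, dtype=float)
    return G @ G.T


def score_statistic(y, K):
    """Quadratic-form score statistic Q = r^T K r / 2.

    r holds the residuals of the case-control labels under an
    intercept-only null model (assumption: no covariates).
    """
    y = np.asarray(y, dtype=float)
    r = y - y.mean()
    return float(r @ K @ r) / 2.0


def snp_set_pvalue(y, G, n_perm=999, seed=0):
    """Return (Q, permutation p-value) for one SNP set.

    Permuting the labels is an illustrative stand-in for an
    analytic null distribution.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    K = linear_kernel(G)
    q_obs = score_statistic(y, K)
    q_null = np.array(
        [score_statistic(rng.permutation(y), K) for _ in range(n_perm)]
    )
    # Add-one correction keeps the p-value strictly above zero.
    p = (1 + np.sum(q_null >= q_obs)) / (n_perm + 1)
    return q_obs, float(p)


if __name__ == "__main__":
    # Simulated toy data: 200 subjects, 10 SNPs coded as 0/1/2 minor-allele counts.
    rng = np.random.default_rng(42)
    G = rng.binomial(2, 0.3, size=(200, 10))
    y = rng.integers(0, 2, size=200)
    q, p = snp_set_pvalue(y, G)
    print(f"Q = {q:.3f}, permutation p = {p:.3f}")
```

Because a single statistic is computed per SNP set, a genome-wide analysis along these lines would run one test per gene or haplotype block rather than one per SNP.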

Original language	English (US)
Pages (from-to)	929-942
Number of pages	14
Journal	American Journal of Human Genetics
Volume	86
Issue number	6
DOIs	https://doi.org/10.1016/j.ajhg.2010.05.002
State	Published - Jul 11 2010
Externally published	Yes

ASJC Scopus subject areas

Genetics
Genetics (clinical)

Cite this

@article{b9abc18119294ae08dda33e128f5cfbf,
title = "Powerful SNP-Set Analysis for Case-Control Genome-wide Association Studies",

abstract = "GWAS have emerged as popular tools for identifying genetic variants that are associated with disease risk. Standard analysis of a case-control GWAS involves assessing the association between each individual genotyped SNP and disease risk. However, this approach suffers from limited reproducibility and difficulties in detecting multi-SNP and epistatic effects. As an alternative analytical strategy, we propose grouping SNPs together into SNP sets on the basis of proximity to genomic features such as genes or haplotype blocks, then testing the joint effect of each SNP set. Testing of each SNP set proceeds via the logistic kernel-machine-based test, which is based on a statistical framework that allows for flexible modeling of epistatic and nonlinear SNP effects. This flexibility and the ability to naturally adjust for covariate effects are important features of our test that make it appealing in comparison to individual SNP tests and existing multimarker tests. Using simulated data based on the International HapMap Project, we show that SNP-set testing can have improved power over standard individual-SNP analysis under a wide range of settings. In particular, we find that our approach has higher power than individual-SNP analysis when the median correlation between the disease-susceptibility variant and the genotyped SNPs is moderate to high. When the correlation is low, both individual-SNP analysis and the SNP-set analysis tend to have low power. We apply SNP-set analysis to analyze the Cancer Genetic Markers of Susceptibility (CGEMS) breast cancer GWAS discovery-phase data.",

author = "Wu, Michael C. and Kraft, Peter and Epstein, Michael P. and Taylor, Deanne M. and Chanock, Stephen J. and Hunter, David J. and Lin, Xihong",
note = "Funding Information: This work was sponsored by National Institutes of Health grants CA76404 and CA134294 (to X.L.) and HG003618 (to M.P.E.).",
year = "2010",
month = jul,
day = "11",
doi = "10.1016/j.ajhg.2010.05.002",
language = "English (US)",
volume = "86",
pages = "929--942",
journal = "American Journal of Human Genetics",
issn = "0002-9297",
publisher = "Cell Press",
number = "6",
}

TY - JOUR
T1 - Powerful SNP-Set Analysis for Case-Control Genome-wide Association Studies
AU - Wu, Michael C.
AU - Kraft, Peter
AU - Epstein, Michael P.
AU - Taylor, Deanne M.
AU - Chanock, Stephen J.
AU - Hunter, David J.
AU - Lin, Xihong
N1 - Funding Information: This work was sponsored by National Institutes of Health grants CA76404 and CA134294 (to X.L.) and HG003618 (to M.P.E.).
PY - 2010/7/11
Y1 - 2010/7/11

N2 - GWAS have emerged as popular tools for identifying genetic variants that are associated with disease risk. Standard analysis of a case-control GWAS involves assessing the association between each individual genotyped SNP and disease risk. However, this approach suffers from limited reproducibility and difficulties in detecting multi-SNP and epistatic effects. As an alternative analytical strategy, we propose grouping SNPs together into SNP sets on the basis of proximity to genomic features such as genes or haplotype blocks, then testing the joint effect of each SNP set. Testing of each SNP set proceeds via the logistic kernel-machine-based test, which is based on a statistical framework that allows for flexible modeling of epistatic and nonlinear SNP effects. This flexibility and the ability to naturally adjust for covariate effects are important features of our test that make it appealing in comparison to individual SNP tests and existing multimarker tests. Using simulated data based on the International HapMap Project, we show that SNP-set testing can have improved power over standard individual-SNP analysis under a wide range of settings. In particular, we find that our approach has higher power than individual-SNP analysis when the median correlation between the disease-susceptibility variant and the genotyped SNPs is moderate to high. When the correlation is low, both individual-SNP analysis and the SNP-set analysis tend to have low power. We apply SNP-set analysis to analyze the Cancer Genetic Markers of Susceptibility (CGEMS) breast cancer GWAS discovery-phase data.

UR - http://www.scopus.com/inward/record.url?scp=77953121307&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77953121307&partnerID=8YFLogxK
U2 - 10.1016/j.ajhg.2010.05.002
DO - 10.1016/j.ajhg.2010.05.002
M3 - Article
C2 - 20560208
AN - SCOPUS:77953121307
SN - 0002-9297
VL - 86
SP - 929
EP - 942
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
IS - 6
ER -
