Testing SNPs and sets of SNPs for importance in association studies

Holger Schwender, Ingo Ruczinski, Katja Ickstadt

Research output: Contribution to journal, Article

Abstract

A major goal of genetic association studies concerned with single nucleotide polymorphisms (SNPs) is the detection of SNPs exhibiting an impact on the risk of developing a disease. Typically, this problem is approached by testing each of the SNPs individually. This, however, can lead to an inaccurate measurement of the influence of the SNPs on the disease risk, in particular, if SNPs only show an effect when interacting with other SNPs, as the multivariate structure of the data is ignored. In this article, we propose a testing procedure based on logic regression that takes this structure into account and therefore enables a more appropriate quantification of importance and ranking of the SNPs than marginal testing. Since even SNP interactions often exhibit only a moderate effect on the disease risk, it can be helpful to also consider sets of SNPs (e.g. SNPs belonging to the same gene or pathway) to borrow strength across these SNP sets and to identify those genes or pathways comprising SNPs that are most consistently associated with the response. We show how the proposed procedure can be adapted for testing SNP sets, and how it can be applied to blocks of SNPs in linkage disequilibrium (LD) to overcome problems caused by LD.
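The point about marginal testing can be illustrated with a small toy example. The sketch below is not the logic-regression procedure proposed in the article; it only shows, under stated assumptions, how two SNPs can each look unassociated with disease when tested individually while a Boolean combination of them is strongly associated. The cohort, the binary (carrier / non-carrier) coding, the risk values, and the helper `chi_square_2x2` are all hypothetical and chosen for illustration.

```python
from itertools import product


def chi_square_2x2(x, y):
    """Pearson chi-square statistic (no continuity correction) for two 0/1 sequences."""
    n = len(x)
    counts = {cell: 0 for cell in product((0, 1), repeat=2)}
    for a, b in zip(x, y):
        counts[(a, b)] += 1
    stat = 0.0
    for (a, b), observed in counts.items():
        row_total = counts[(a, 0)] + counts[(a, 1)]
        col_total = counts[(0, b)] + counts[(1, b)]
        expected = row_total * col_total / n
        stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical cohort: 100 subjects for each of the four carrier combinations.
# Assumed risk model: 80% cases if exactly one SNP is carried, 20% otherwise.
snp1, snp2, case = [], [], []
for a, b in product((0, 1), repeat=2):
    n_cases = 80 if a != b else 20
    for i in range(100):
        snp1.append(a)
        snp2.append(b)
        case.append(1 if i < n_cases else 0)

# Boolean combination (SNP1 AND NOT SNP2) OR (NOT SNP1 AND SNP2).
interaction = [int((a and not b) or (not a and b)) for a, b in zip(snp1, snp2)]

marginal_1 = chi_square_2x2(snp1, case)   # each SNP tested on its own
marginal_2 = chi_square_2x2(snp2, case)
joint = chi_square_2x2(interaction, case)  # the interaction tested as one feature

print(marginal_1, marginal_2, joint)  # prints: 0.0 0.0 144.0
```

In this constructed data set both marginal statistics are exactly zero because the case proportion is 50% among carriers and non-carriers of either SNP, whereas the combined feature separates an 80% risk group from a 20% risk group. Real data are rarely this extreme, but the same mechanism is what makes a purely marginal ranking of SNPs potentially misleading.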

Original language: English (US)
Pages (from-to): 18-32
Number of pages: 15
Journal: Biostatistics
Volume: 12
Issue number: 1
DOI: 10.1093/biostatistics/kxq042
State: Published - Jan 2011

Keywords

  • Feature selection
  • GENICA
  • Importance measure
  • Logic regression
  • logicFS

ASJC Scopus subject areas

  • Medicine (all)
  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Testing SNPs and sets of SNPs for importance in association studies. / Schwender, Holger; Ruczinski, Ingo; Ickstadt, Katja.

In: Biostatistics, Vol. 12, No. 1, 01.2011, p. 18-32.

@article{2123e342f8cb4709aeb23a1e2616e0b3,
title = "Testing SNPs and sets of SNPs for importance in association studies",
keywords = "Feature selection, GENICA, Importance measure, Logic regression, logicFS",
author = "Holger Schwender and Ingo Ruczinski and Katja Ickstadt",
year = "2011",
month = "1",
doi = "10.1093/biostatistics/kxq042",
language = "English (US)",
volume = "12",
pages = "18--32",
journal = "Biostatistics",
issn = "1465-4644",
publisher = "Oxford University Press",
number = "1",

}

TY - JOUR

T1 - Testing SNPs and sets of SNPs for importance in association studies

AU - Schwender, Holger

AU - Ruczinski, Ingo

AU - Ickstadt, Katja

PY - 2011/1

Y1 - 2011/1

AB - A major goal of genetic association studies concerned with single nucleotide polymorphisms (SNPs) is the detection of SNPs exhibiting an impact on the risk of developing a disease. Typically, this problem is approached by testing each of the SNPs individually. This, however, can lead to an inaccurate measurement of the influence of the SNPs on the disease risk, in particular, if SNPs only show an effect when interacting with other SNPs, as the multivariate structure of the data is ignored. In this article, we propose a testing procedure based on logic regression that takes this structure into account and therefore enables a more appropriate quantification of importance and ranking of the SNPs than marginal testing. Since even SNP interactions often exhibit only a moderate effect on the disease risk, it can be helpful to also consider sets of SNPs (e.g. SNPs belonging to the same gene or pathway) to borrow strength across these SNP sets and to identify those genes or pathways comprising SNPs that are most consistently associated with the response. We show how the proposed procedure can be adapted for testing SNP sets, and how it can be applied to blocks of SNPs in linkage disequilibrium (LD) to overcome problems caused by LD.

KW - Feature selection

KW - GENICA

KW - Importance measure

KW - Logic regression

KW - logicFS

UR - http://www.scopus.com/inward/record.url?scp=78650647757&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78650647757&partnerID=8YFLogxK

U2 - 10.1093/biostatistics/kxq042

DO - 10.1093/biostatistics/kxq042

M3 - Article

C2 - 20601626

AN - SCOPUS:78650647757

VL - 12

SP - 18

EP - 32

JO - Biostatistics

JF - Biostatistics

SN - 1465-4644

IS - 1

ER -