Incorporating genotype uncertainties into the genotypic TDT for main effects and gene-environment interactions.

Research output: Contribution to journal, Article

Abstract

Genotype imputation has become a standard way for researchers to expand their genotype datasets and improve signal precision and power in tests of genetic association with disease. In imputation for family-based studies, however, subjects are often treated as unrelated individuals. Currently, only BEAGLE allows simultaneous imputation for trios of parents and offspring, and it returns only the most likely genotype calls, not estimated genotype probabilities. For population-based SNP association studies, it has been shown that incorporating genotype uncertainty can be more powerful than using hard genotype calls. Here we investigate this issue in the context of case-parent family data. We present the statistical framework for the genotypic transmission-disequilibrium test (gTDT) using observed genotype calls and imputed genotype probabilities, derive an extension to assess gene-environment interactions for binary environmental variables, and illustrate the performance of our method on a set of trios from the International Cleft Consortium. In contrast to population-based studies, however, using genotype probabilities in this framework (derived by treating the family members as unrelated) can bias the test statistics toward protectiveness for the minor allele, particularly for markers with lower minor allele frequencies and lower imputation quality. We further compare, based on hard genotype calls, the results of ignoring relatedness in the imputation with those of taking family structure into account. We find that by far the least biased results are obtained when family structure is taken into account, and we currently recommend this approach in spite of its intense computational requirements.

Original language: English (US)
Pages (from-to): 225-234
Number of pages: 10
Journal: Genetic Epidemiology
Volume: 36
Issue number: 3
DOI: 10.1002/gepi.21615
State: Published - Apr 2012

Fingerprint

Gene-Environment Interaction
Uncertainty
Genotype
Gene Frequency
Population
Single Nucleotide Polymorphism
Parents
Alleles
Research Personnel

ASJC Scopus subject areas

  • Genetics(clinical)
  • Epidemiology

Cite this

@article{ea1a20aa8a8746dca6aecc2e6bc4cb5c,
title = "Incorporating genotype uncertainties into the genotypic TDT for main effects and gene-environment interactions.",
abstract = "Genotype imputation has become a standard option for researchers to expand their genotype datasets to improve signal precision and power in tests of genetic association with disease. In imputations for family-based studies however, subjects are often treated as unrelated individuals: currently, only BEAGLE allows for simultaneous imputation for trios of parents and offspring; however, only the most likely genotype calls are returned, not estimated genotype probabilities. For population-based SNP association studies, it has been shown that incorporating genotype uncertainty can be more powerful than using hard genotype calls. We here investigate this issue in the context of case-parent family data. We present the statistical framework for the genotypic transmission-disequilibrium test (gTDT) using observed genotype calls and imputed genotype probabilities, derive an extension to assess gene-environment interactions for binary environmental variables, and illustrate the performance of our method on a set of trios from the International Cleft Consortium. In contrast to population-based studies, however, utilizing the genotype probabilities in this framework (derived by treating the family members as unrelated) can result in biases of the test statistics toward protectiveness for the minor allele, particularly for markers with lower minor allele frequencies and lower imputation quality. We further compare the results between ignoring relatedness in the imputation and taking family structure into account, based on hard genotype calls. We find that by far the least biased results are obtained when family structure is taken into account and currently recommend this approach in spite of its intense computational requirements.",
author = "Taub, {Margaret A.} and Holger Schwender and Beaty, {Terri H.} and Louis, {Thomas A.} and Ingo Ruczinski",
year = "2012",
month = "4",
doi = "10.1002/gepi.21615",
language = "English (US)",
volume = "36",
pages = "225--234",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "3",

}

TY - JOUR

T1 - Incorporating genotype uncertainties into the genotypic TDT for main effects and gene-environment interactions.

AU - Taub, Margaret A.

AU - Schwender, Holger

AU - Beaty, Terri H.

AU - Louis, Thomas A.

AU - Ruczinski, Ingo

PY - 2012/4

Y1 - 2012/4

AB - Genotype imputation has become a standard option for researchers to expand their genotype datasets to improve signal precision and power in tests of genetic association with disease. In imputations for family-based studies however, subjects are often treated as unrelated individuals: currently, only BEAGLE allows for simultaneous imputation for trios of parents and offspring; however, only the most likely genotype calls are returned, not estimated genotype probabilities. For population-based SNP association studies, it has been shown that incorporating genotype uncertainty can be more powerful than using hard genotype calls. We here investigate this issue in the context of case-parent family data. We present the statistical framework for the genotypic transmission-disequilibrium test (gTDT) using observed genotype calls and imputed genotype probabilities, derive an extension to assess gene-environment interactions for binary environmental variables, and illustrate the performance of our method on a set of trios from the International Cleft Consortium. In contrast to population-based studies, however, utilizing the genotype probabilities in this framework (derived by treating the family members as unrelated) can result in biases of the test statistics toward protectiveness for the minor allele, particularly for markers with lower minor allele frequencies and lower imputation quality. We further compare the results between ignoring relatedness in the imputation and taking family structure into account, based on hard genotype calls. We find that by far the least biased results are obtained when family structure is taken into account and currently recommend this approach in spite of its intense computational requirements.

UR - http://www.scopus.com/inward/record.url?scp=84867606311&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84867606311&partnerID=8YFLogxK

U2 - 10.1002/gepi.21615

DO - 10.1002/gepi.21615

M3 - Article

C2 - 22678881

AN - SCOPUS:84867606311

VL - 36

SP - 225

EP - 234

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

IS - 3

ER -