Using duplicate genotyped data in genetic analyses: Testing association and estimating error rates

Nathan L. Tintle, Derek Gordon, Francis J. McMahon, Stephen J. Finch

Research output: Contribution to journal › Article › peer-review


Although researchers use duplicate genotyped data to calculate an inconsistency rate, there is no power analysis to assess the value of the duplicate data. In this paper, we present a model in which the genotyping error rate is related to the inconsistency rate. We extend the g genotype by h phenotype chi-squared test to incorporate the duplicate genotyped data. When a subject is inconsistently genotyped (that is, has two observed genotypes), our procedure is to allocate 0.5 units to each of the two genotypes. We specify the multivariate analysis of variance (MANOVA) test comparing these extended counts. We provide freely available software for this test and also for a permutation test used on small samples. A simulation study shows that the asymptotic null distribution of the MANOVA test holds when the total number of subjects, N, is at least 300. We also document with a simulation study that the asymptotic distribution of this test under various alternative hypotheses is a satisfactory approximation to the simulated power. In all cases, the power of the MANOVA test using the duplicate genotyped data is greater than the power of the chi-squared test ignoring the duplicate data. Power increases ranged from 0.776% to 4.652% for 80% powered tests and 0.292% to 2.028% for 95% powered tests. Researchers now can compute the value of the duplicate genotyped data as part of the design of the study.
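The allocation rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' software: the function names `extended_counts` and `inconsistency_rate`, the input layout, and the toy data are all hypothetical. It also assumes that a subject genotyped once, or genotyped twice with matching calls, contributes one full unit to its genotype, which the abstract does not state explicitly. The sketch only builds the extended g genotype by h phenotype table and an observed inconsistency rate; it does not implement the MANOVA or permutation test.

```python
def extended_counts(subjects, genotypes, phenotypes):
    """Build a g-by-h table of (possibly fractional) genotype-by-phenotype counts.

    subjects: iterable of (phenotype, calls), where calls is a tuple holding
    one genotype (genotyped once) or two genotypes (genotyped in duplicate).
    """
    table = {g: {p: 0.0 for p in phenotypes} for g in genotypes}
    for phenotype, calls in subjects:
        if len(calls) == 2 and calls[0] != calls[1]:
            # Inconsistent duplicate: allocate 0.5 units to each observed genotype.
            for g in calls:
                table[g][phenotype] += 0.5
        else:
            # Assumption: single call or consistent duplicate counts as one unit.
            table[calls[0]][phenotype] += 1.0
    return table


def inconsistency_rate(subjects):
    """Fraction of duplicate-genotyped subjects whose two calls disagree."""
    duplicates = [calls for _, calls in subjects if len(calls) == 2]
    if not duplicates:
        return float("nan")
    return sum(a != b for a, b in duplicates) / len(duplicates)


# Hypothetical toy data: (phenotype, genotype calls).
subjects = [
    ("case", ("AA",)),
    ("case", ("AB", "AB")),
    ("case", ("AB", "BB")),
    ("control", ("AA", "AB")),
    ("control", ("BB",)),
]
table = extended_counts(subjects, ["AA", "AB", "BB"], ["case", "control"])
rate = inconsistency_rate(subjects)
```

Because each subject contributes exactly one unit in total, the entries of the extended table still sum to the number of subjects, N.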

Original language: English (US)
Article number: 4
Journal: Statistical applications in genetics and molecular biology
Issue number: 1
State: Published - Feb 5 2007
Externally published: Yes


Keywords

  • Case-control
  • Duplicate genotype
  • Genome wide association
  • Genotype error
  • Inconsistency rate
  • Misclassification
  • Re-genotype
  • Test of association
  • Whole genome association

ASJC Scopus subject areas

  • Statistics and Probability
  • Molecular Biology
  • Genetics
  • Computational Mathematics

