Evaluation of self-reported ethnicity in a case-control population

The stroke prevention in young women study

Jesse B. Mez, John W. Cole, Timothy D. Howard, Leah R. MacClellan, Oscar C. Stine, Jeffery R. O'Connell, Marcella A. Wozniak, Barney Stern, John D. Sorkin, Braxton D. Mitchell, Steven J. Kittner

Research output: Contribution to journal › Article

Abstract

Background. Population-based association studies are used to identify common susceptibility variants for complex genetic traits. These studies are susceptible to confounding from unknown population substructure. Here we apply a model-based clustering approach to our case-control study of stroke among young women to examine whether self-reported ethnicity can serve as a proxy for genetic ancestry.

Findings. A population-based case-control study of stroke among women aged 15-49 identified 361 cases of first ischemic stroke and 401 age-comparable control subjects. Thirty single nucleotide polymorphisms (SNPs) throughout the genome, unrelated to stroke risk and with established ancestry-based allele frequency differences, were genotyped in all participants. The Structure program was used to iteratively evaluate K = 1 to 5 potential genetic-based subpopulations. Evaluating the population as a whole, the Structure output plateaued at K = 2 clusters. 98% of self-reported Caucasians had an estimated probability ≥ 50% of belonging to Cluster 1, while 94% of self-reported African-Americans had an estimated probability ≥ 50% of belonging to Cluster 2. Stratifying the participants by self-reported ethnicity and repeating the analyses revealed the presence of two clusters among Caucasians, suggesting that substructure may exist.

Conclusions. Among our combined sample of African-American and Caucasian participants there is no large unknown subpopulation, and self-reported ethnicity can serve as a proxy for genetic ancestry. Ethnicity-specific analyses indicate that population substructure may exist among the Caucasian participants, indicating that further studies are warranted.
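The comparison between self-reported ethnicity and cluster membership described in the Findings can be illustrated with a minimal sketch. It assumes K = 2 and that each participant has one estimated probability of belonging to Cluster 1 (with Cluster 2 as the complement), and it assumes the cutoff is a probability of at least 50%. The function name `concordance`, the variable names, and the toy values are hypothetical and are not taken from the study or from the Structure program.

```python
from collections import defaultdict


def concordance(labels, q_cluster1, threshold=0.5):
    """For each self-reported group, return a pair of fractions:
    (share with P(Cluster 1) >= threshold, share with P(Cluster 2) >= threshold).
    Assumes K = 2, so P(Cluster 2) = 1 - P(Cluster 1)."""
    counts = defaultdict(lambda: [0, 0, 0])  # [group size, n in cluster 1, n in cluster 2]
    for label, q1 in zip(labels, q_cluster1):
        row = counts[label]
        row[0] += 1
        if q1 >= threshold:
            row[1] += 1
        if 1.0 - q1 >= threshold:
            row[2] += 1
    return {group: (c1 / n, c2 / n) for group, (n, c1, c2) in counts.items()}


# Hypothetical toy data for illustration only (not study data).
labels = ["Caucasian"] * 4 + ["African-American"] * 4
q1 = [0.97, 0.91, 0.88, 0.30, 0.05, 0.10, 0.20, 0.75]

result = concordance(labels, q1)
for group, (share_c1, share_c2) in result.items():
    print(f"{group}: {share_c1:.0%} in Cluster 1, {share_c2:.0%} in Cluster 2")
```

With real output, the membership probabilities would come from the clustering run itself; the sketch only shows how the per-group percentages of the kind reported above could be tabulated.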

Original language: English (US)
Article number: 260
Journal: BMC Research Notes
Volume: 2
DOI: https://doi.org/10.1186/1756-0500-2-260
State: Published - 2009
Externally published: Yes

Fingerprint

Stroke
Population
Proxy
African Americans
Case-Control Studies
Polymorphism
Nucleotides
Genes
Gene Frequency
Single Nucleotide Polymorphism
Cluster Analysis
Genome

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology(all)
  • Medicine(all)

Cite this

Mez, J. B., Cole, J. W., Howard, T. D., MacClellan, L. R., Stine, O. C., O'Connell, J. R., ... Kittner, S. J. (2009). Evaluation of self-reported ethnicity in a case-control population: The stroke prevention in young women study. BMC Research Notes, 2, [260]. https://doi.org/10.1186/1756-0500-2-260

Evaluation of self-reported ethnicity in a case-control population: The stroke prevention in young women study. / Mez, Jesse B.; Cole, John W.; Howard, Timothy D.; MacClellan, Leah R.; Stine, Oscar C.; O'Connell, Jeffery R.; Wozniak, Marcella A.; Stern, Barney; Sorkin, John D.; Mitchell, Braxton D.; Kittner, Steven J.

In: BMC Research Notes, Vol. 2, 260, 2009.

Mez, JB, Cole, JW, Howard, TD, MacClellan, LR, Stine, OC, O'Connell, JR, Wozniak, MA, Stern, B, Sorkin, JD, Mitchell, BD & Kittner, SJ 2009, 'Evaluation of self-reported ethnicity in a case-control population: The stroke prevention in young women study', BMC Research Notes, vol. 2, 260. https://doi.org/10.1186/1756-0500-2-260
Mez, Jesse B.; Cole, John W.; Howard, Timothy D.; MacClellan, Leah R.; Stine, Oscar C.; O'Connell, Jeffery R.; Wozniak, Marcella A.; Stern, Barney; Sorkin, John D.; Mitchell, Braxton D.; Kittner, Steven J. / Evaluation of self-reported ethnicity in a case-control population: The stroke prevention in young women study. In: BMC Research Notes. 2009; Vol. 2.
@article{49f055c583b84d43bff019148731a853,
title = "Evaluation of self-reported ethnicity in a case-control population: The stroke prevention in young women study",
author = "Mez, {Jesse B.} and Cole, {John W.} and Howard, {Timothy D.} and MacClellan, {Leah R.} and Stine, {Oscar C.} and O'Connell, {Jeffery R.} and Wozniak, {Marcella A.} and Barney Stern and Sorkin, {John D.} and Mitchell, {Braxton D.} and Kittner, {Steven J.}",
year = "2009",
doi = "10.1186/1756-0500-2-260",
language = "English (US)",
volume = "2",
journal = "BMC Research Notes",
issn = "1756-0500",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Evaluation of self-reported ethnicity in a case-control population

T2 - The stroke prevention in young women study

AU - Mez, Jesse B.

AU - Cole, John W.

AU - Howard, Timothy D.

AU - MacClellan, Leah R.

AU - Stine, Oscar C.

AU - O'Connell, Jeffery R.

AU - Wozniak, Marcella A.

AU - Stern, Barney

AU - Sorkin, John D.

AU - Mitchell, Braxton D.

AU - Kittner, Steven J.

PY - 2009

Y1 - 2009

UR - http://www.scopus.com/inward/record.url?scp=77049127371&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77049127371&partnerID=8YFLogxK

U2 - 10.1186/1756-0500-2-260

DO - 10.1186/1756-0500-2-260

M3 - Article

VL - 2

JO - BMC Research Notes

JF - BMC Research Notes

SN - 1756-0500

M1 - 260

ER -