Genetic admixture and population substructure in Guanacaste Costa Rica

Zhaoming Wang, Allan Hildesheim, Sophia S. Wang, Rolando Herrero, Paula Gonzalez, Laurie Burdette, Amy Hutchinson, Gilles Thomas, Stephen J. Chanock, Kai Yu

Research output: Contribution to journal › Article

Abstract

The population of Costa Rica (CR) represents an admixture of major continental populations. An investigation of the CR population structure would provide an important foundation for mapping genetic variants underlying common diseases and traits. We analyzed 1,301 women from the Guanacaste region of CR using 27,904 single nucleotide polymorphisms (SNPs) genotyped on a custom Illumina InfiniumII iSelect chip. The program STRUCTURE was used to compare the CR Guanacaste sample with four continental reference samples: HapMap Europeans (CEU), East Asians (JPT+CHB) and West African Yoruba (YRI), as well as Native Americans (NA) from the Illumina iControl database. Our results show that the CR Guanacaste sample comprises a three-way admixture estimated to be 43% European, 38% Native American and 15% West African. An estimated 4% residual Asian ancestry may be within the error range. Results from principal components analysis reveal a correlation between genetic and geographic distance. The magnitude of linkage disequilibrium (LD), measured by the number of tagging SNPs required to cover the same genomic region, appeared to be weaker in the CR Guanacaste sample than in the CEU, JPT+CHB and NA reference samples but stronger than in the HapMap YRI sample. Based on the clustering pattern observed in both STRUCTURE and principal components analysis, two subpopulations were identified that differ by approximately 20% in LD block size, averaged over all LD blocks identified by Haploview. We also show, in a simulated association study conducted within the two subpopulations, that failure to account for population stratification (PS) could lead to a noticeable inflation in the false positive rate. However, we further demonstrate that existing PS adjustment approaches can reduce the inflation to an acceptable level for gene discovery.
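The stratification effect described at the end of the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation and uses none of the study data: the Balding-Nichols allele-frequency model, the sample sizes, the `fst` value, the disease prevalences, and all function names are assumptions chosen for illustration. It draws null SNPs in two hypothetical subpopulations that differ in both allele frequencies and case rate, then compares the genomic inflation factor of a trend test before and after centering genotype and phenotype within each subpopulation. Centering on known labels is a simplified stand-in for adjustment methods that use inferred ancestry.

```python
import numpy as np

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 d.o.f.


def simulate_stratified_null(n_per_pop=1000, n_snps=2000, fst=0.01,
                             prevalence=(0.2, 0.5), seed=0):
    """Two hypothetical subpopulations; no SNP affects case status."""
    rng = np.random.default_rng(seed)
    p_anc = rng.uniform(0.2, 0.8, n_snps)            # ancestral allele frequencies
    a = p_anc * (1 - fst) / fst                       # Balding-Nichols Beta parameters
    b = (1 - p_anc) * (1 - fst) / fst
    freqs = rng.beta(a, b, size=(2, n_snps))          # one frequency row per subpopulation
    pop = np.repeat([0, 1], n_per_pop)                # subpopulation label per individual
    geno = rng.binomial(2, freqs[pop])                # genotypes coded 0/1/2
    pheno = (rng.random(pop.size) < np.asarray(prevalence)[pop]).astype(float)
    return geno, pheno, pop


def trend_chi2(geno, pheno):
    """Trend statistic N * r^2 between each SNP column and the phenotype."""
    g = geno - geno.mean(axis=0)
    y = pheno - pheno.mean()
    num = (g * y[:, None]).sum(axis=0) ** 2
    den = (g ** 2).sum(axis=0) * (y ** 2).sum()
    return len(pheno) * num / den


def center_within(x, pop):
    """Subtract the subpopulation mean (works for 1-D and 2-D arrays)."""
    x = np.asarray(x, dtype=float).copy()
    for k in np.unique(pop):
        x[pop == k] -= x[pop == k].mean(axis=0)
    return x


def genomic_inflation(chi2):
    """Median test statistic relative to its expectation under the null."""
    return float(np.median(chi2) / CHI2_1DF_MEDIAN)


if __name__ == "__main__":
    geno, pheno, pop = simulate_stratified_null()
    lam_raw = genomic_inflation(trend_chi2(geno, pheno))
    lam_adj = genomic_inflation(
        trend_chi2(center_within(geno, pop), center_within(pheno, pop)))
    print(f"lambda unadjusted: {lam_raw:.2f}")
    print(f"lambda adjusted:   {lam_adj:.2f}")
```

Under these assumptions the unadjusted inflation factor should sit well above 1, while the adjusted one should be close to 1. The exact values depend on the hypothetical parameters and the random seed.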

Original language: English (US)
Article number: e13336
Journal: PLoS One
Volume: 5
Issue number: 10
DOI: https://doi.org/10.1371/journal.pone.0013336
State: Published - 2010
Externally published: Yes

Fingerprint

Costa Rica
Population Genetics
Polymorphism
Principal Component Analysis
Nucleotides
Genes
North American Indians
American Indians
Linkage Disequilibrium
HapMap Project
Population
Inflation
Economic Inflation
Sampling
Single Nucleotide Polymorphism
Genetic Association Studies

ASJC Scopus subject areas

  • Agricultural and Biological Sciences(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Medicine(all)

Cite this

Wang, Z., Hildesheim, A., Wang, S. S., Herrero, R., Gonzalez, P., Burdette, L., ... Yu, K. (2010). Genetic admixture and population substructure in Guanacaste Costa Rica. PLoS One, 5(10), [e13336]. https://doi.org/10.1371/journal.pone.0013336

@article{29d41b97306e4f1a8194962d601d7f7d,
title = "Genetic admixture and population substructure in Guanacaste Costa Rica",
author = "Zhaoming Wang and Allan Hildesheim and Wang, {Sophia S.} and Rolando Herrero and Paula Gonzalez and Laurie Burdette and Amy Hutchinson and Gilles Thomas and Chanock, {Stephen J.} and Kai Yu",
year = "2010",
doi = "10.1371/journal.pone.0013336",
language = "English (US)",
volume = "5",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "10",

}

TY  - JOUR
T1  - Genetic admixture and population substructure in Guanacaste Costa Rica
AU  - Wang, Zhaoming
AU  - Hildesheim, Allan
AU  - Wang, Sophia S.
AU  - Herrero, Rolando
AU  - Gonzalez, Paula
AU  - Burdette, Laurie
AU  - Hutchinson, Amy
AU  - Thomas, Gilles
AU  - Chanock, Stephen J.
AU  - Yu, Kai
PY  - 2010
Y1  - 2010
UR  - http://www.scopus.com/inward/record.url?scp=78149438449&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=78149438449&partnerID=8YFLogxK
U2  - 10.1371/journal.pone.0013336
DO  - 10.1371/journal.pone.0013336
M3  - Article
C2  - 20967209
AN  - SCOPUS:78149438449
VL  - 5
JO  - PLoS One
JF  - PLoS One
SN  - 1932-6203
IS  - 10
M1  - e13336
ER  -