TY - JOUR
T1 - Genetic admixture and population substructure in Guanacaste Costa Rica
AU - Wang, Zhaoming
AU - Hildesheim, Allan
AU - Wang, Sophia S.
AU - Herrero, Rolando
AU - Gonzalez, Paula
AU - Burdette, Laurie
AU - Hutchinson, Amy
AU - Thomas, Gilles
AU - Chanock, Stephen J.
AU - Yu, Kai
N1 - Funding Information:
Some authors are affiliated with SAIC-Frederick, Inc., with federal funds from the National Cancer Institute, National Institutes of Health, under Contract No. HHSN261200800001E. SAIC-Frederick is specifically and solely a government contractor for the NCI. There are no competing interests.
PY - 2010
Y1 - 2010
N2 - The population of Costa Rica (CR) represents an admixture of major continental populations. An investigation of the CR population structure would provide an important foundation for mapping genetic variants underlying common diseases and traits. We conducted an analysis of 1,301 women from the Guanacaste region of CR using 27,904 single nucleotide polymorphisms (SNPs) genotyped on a custom Illumina InfiniumII iSelect chip. The program STRUCTURE was used to compare the CR Guanacaste sample with four continental reference samples: HapMap Europeans (CEU), East Asians (JPT+CHB), and West African Yoruba (YRI), as well as Native Americans (NA) from the Illumina iControl database. Our results show that the CR Guanacaste sample comprises a three-way admixture estimated to be 43% European, 38% Native American and 15% West African. An estimated 4% residual Asian ancestry may be within the error range. Results from principal components analysis reveal a correlation between genetic and geographic distance. The magnitude of linkage disequilibrium (LD), measured by the number of tagging SNPs required to cover the same region of the genome, appeared to be weaker in the CR Guanacaste sample than in the CEU, JPT+CHB and NA reference samples but stronger than in the HapMap YRI sample. Based on the clustering pattern observed in both STRUCTURE and principal components analysis, two subpopulations were identified that differ by approximately 20% in LD block size averaged over all LD blocks identified by Haploview. We also show, in a simulated association study conducted within the two subpopulations, that failure to account for population stratification (PS) could lead to a noticeable inflation in the false positive rate. However, we further demonstrate that existing PS adjustment approaches can reduce the inflation to an acceptable level for gene discovery.
UR - http://www.scopus.com/inward/record.url?scp=78149438449&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78149438449&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0013336
DO - 10.1371/journal.pone.0013336
M3 - Article
C2 - 20967209
AN - SCOPUS:78149438449
VL - 5
JO - PLoS One
JF - PLoS One
SN - 1932-6203
IS - 10
M1 - e13336
ER -