Selecting SNPs informative for African, American Indian and European Ancestry: Application to the Family Investigation of Nephropathy and Diabetes (FIND)

Robert C. Williams, Robert C. Elston, Pankaj Kumar, William C. Knowler, Hanna E. Abboud, Sharon Adler, Donald W. Bowden, Jasmin Divers, Barry I. Freedman, Robert P. Igo, Eli Ipp, Sudha K. Iyengar, Paul L. Kimmel, Michael J. Klag, Orly Kohn, Carl D. Langefeld, David J. Leehey, Robert G. Nelson, Susanne B. Nicholas, Madeleine V. Pahl, Rulan S. Parekh, Jerome I. Rotter, Jeffrey R. Schelling, John R. Sedor, Vallabh O. Shah, Michael W. Smith, Kent D. Taylor, Farook Thameem, Denyse Thornley-Brown, Cheryl A. Winkler, Xiuqing Guo, Phillip Zager, Robert L. Hanson, FIND Research Group, S. K. Iyengar, R. C. Elston, K. A. B. Goddard, J. M. Olson, S. Ialacci, J. Fondran, A. Horvath, R. Igo, G. Jun, K. Kramp, J. Molineros, S. R. E. Quade, J. R. Sedor, J. Schelling, A. Pickens, L. Humbert, L. Getz-Fradley, S. Adler, E. Ipp, M. Pahl, M. F. Seldin, S. Snyder, J. Tayek, E. Hernandez, J. LaPage, C. Garcia, J. Gonzalez, M. Aguilar, Michael John Klag, R. Parekh, L. Kao, Lucy Ann Meoni, T. Whitehead, J. Chester, W. C. Knowler, R. L. Hanson, R. G. Nelson, J. Wolford, L. Jones, R. Juan, R. Lovelace, C. Luethe, L. M. Phillips, J. Sewemaenewa, I. Sili, B. Waseta, M. F. Saad, S. B. Nicholas, Y. D. I. Chen, X. Guo, J. Rotter, K. Taylor, M. Budgett, F. Hariri, P. Zager, V. Shah, M. Scavini, A. Bobelu, H. Abboud, N. Arar, R. Duggirala, B. S. Kasinath, F. Thameem, M. Stern, B. I. Freedman, D. W. Bowden, C. D. Langefeld, S. C. Satko, S. S. Rich, S. Warren, S. Viverette, G. Brooks, R. Young, M. Spainhour, C. Winkler, M. W. Smith, M. Thompson, R. Hanson, B. Kessing, D. J. Leehey, G. Barone, D. Thornley-Brown, C. Jefferson, O. F. Kohn, C. S. Brown, J. P. Briggs, P. L. Kimmel, R. Rasooly, D. Warnock, L. Cardon, R. Chakraborty, G. M. Dunston, T. Hostetter, S. J. O'Brien, J. Rioux, R. Spielman

Research output: Contribution to journal › Article

Abstract

Background: The presence of population structure in a sample may confound the search for important genetic loci associated with disease. Our four samples in the Family Investigation of Nephropathy and Diabetes (FIND), European Americans, Mexican Americans, African Americans, and American Indians, are part of a genome-wide association study in which population structure might be particularly important. We therefore decided to study in detail one component of this, individual genetic ancestry (IGA). From SNPs present on the Affymetrix 6.0 Human SNP array, we identified 3 sets of ancestry informative markers (AIMs), each maximized for the information in one of the three contrasts among ancestral populations: Europeans (HAPMAP, CEU), Africans (HAPMAP, YRI and LWK), and Native Americans (full heritage Pima Indians). We estimate IGA and present an algorithm for their standard errors, compare IGA to principal components, emphasize the importance of balancing information in the AIMs, and test the association of IGA with diabetic nephropathy in the combined sample.

Results: A fixed parental allele maximum likelihood algorithm was applied to the FIND to estimate IGA in four samples: 869 American Indians; 1385 African Americans; 1451 Mexican Americans; and 826 European Americans. When the information in the AIMs is unbalanced, the estimates are incorrect with large error. Individual genetic admixture is highly correlated with principal components for capturing population structure. It takes ~700 SNPs to reduce the average standard error of individual admixture below 0.01. When the samples are combined, the resulting population structure creates associations between IGA and diabetic nephropathy.

Conclusions: The identified set of AIMs, which includes American Indian parental allele frequencies, may be particularly useful for estimating genetic admixture in populations from the Americas. Failure to balance information in maximum likelihood, poly-ancestry models creates biased estimates of individual admixture with large error. This also occurs when estimating IGA using the Bayesian clustering method as implemented in the program STRUCTURE. Odds ratios for the associations of IGA with disease are consistent with what is known about the incidence and prevalence of diabetic nephropathy in these populations.
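To make the idea of a fixed parental allele maximum likelihood estimate concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple model in which each genotype is a binomial draw of two alleles whose frequency is the ancestry-weighted mix of the parental frequencies, and it maximizes that likelihood with an EM update, which is a choice made here for illustration. The function name `estimate_admixture`, its parameters, and the simulated frequencies are all hypothetical; the simulated markers are random and are not the AIMs or FIND data described in the abstract.

```python
import numpy as np


def estimate_admixture(genotypes, parental_freqs, n_iter=500, tol=1e-10):
    """EM maximum-likelihood estimate of one individual's ancestry proportions.

    genotypes: shape (J,), copies of the reference allele (0, 1 or 2) per SNP.
    parental_freqs: shape (K, J), reference-allele frequencies in each of the
        K parental populations; these are held fixed (never re-estimated).
    Returns an array of K non-negative proportions that sum to 1.
    """
    g = np.asarray(genotypes, dtype=float)
    # Clip frequencies away from 0 and 1 so the divisions below are safe.
    f = np.clip(np.asarray(parental_freqs, dtype=float), 1e-6, 1 - 1e-6)
    k, j = f.shape
    q = np.full(k, 1.0 / k)  # start from equal ancestry proportions
    for _ in range(n_iter):
        p = q @ f  # individual-specific allele frequency at each SNP
        # Expected number of this individual's alleles drawn from each
        # population, given the genotype and the current proportions.
        w = q[:, None] * (g * f / p + (2 - g) * (1 - f) / (1 - p))
        q_new = w.sum(axis=1) / (2 * j)
        converged = np.max(np.abs(q_new - q)) < tol
        q = q_new
        if converged:
            break
    return q


# Illustrative simulation only (assumed values, not data from the study).
rng = np.random.default_rng(0)
n_snps = 700
freqs = rng.uniform(0.05, 0.95, size=(3, n_snps))  # hypothetical parental frequencies
true_q = np.array([0.2, 0.5, 0.3])                 # hypothetical true ancestry
sim_genotypes = rng.binomial(2, true_q @ freqs)
print(estimate_admixture(sim_genotypes, freqs))
```

Because the parental frequencies stay fixed, only the individual's proportions are updated, and each update keeps them on the simplex. How close the estimate lands to the simulated truth depends on how strongly the parental frequencies differ at the chosen markers, which is why marker selection matters.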

Original language: English (US)
Article number: 325
Journal: BMC Genomics
Volume: 17
Issue number: 1
DOI: https://doi.org/10.1186/s12864-016-2654-x
State: Published - May 4 2016

Fingerprint

North American Indians
African Americans
Single Nucleotide Polymorphism
Diabetic Nephropathies
Population
Potassium Iodide
Genetic Loci
Bayes Theorem
Genome-Wide Association Study
Gene Frequency
Cluster Analysis
Alleles
Odds Ratio
Incidence

Keywords

  • Diabetic nephropathy
  • Individual genetic ancestry
  • Population structure
  • SNP

ASJC Scopus subject areas

  • Biotechnology
  • Genetics

Cite this

Selecting SNPs informative for African, American Indian and European Ancestry: Application to the Family Investigation of Nephropathy and Diabetes (FIND). / Williams, Robert C.; Elston, Robert C.; Kumar, Pankaj; Knowler, William C.; Abboud, Hanna E.; Adler, Sharon; Bowden, Donald W.; Divers, Jasmin; Freedman, Barry I.; Igo, Robert P.; Ipp, Eli; Iyengar, Sudha K.; Kimmel, Paul L.; Klag, Michael J.; Kohn, Orly; Langefeld, Carl D.; Leehey, David J.; Nelson, Robert G.; Nicholas, Susanne B.; Pahl, Madeleine V.; Parekh, Rulan S.; Rotter, Jerome I.; Schelling, Jeffrey R.; Sedor, John R.; Shah, Vallabh O.; Smith, Michael W.; Taylor, Kent D.; Thameem, Farook; Thornley-Brown, Denyse; Winkler, Cheryl A.; Guo, Xiuqing; Zager, Phillip; Hanson, Robert L.; FIND Research Group; Iyengar, S. K.; Elston, R. C.; Goddard, K. A. B.; Olson, J. M.; Ialacci, S.; Fondran, J.; Horvath, A.; Igo, R.; Jun, G.; Kramp, K.; Molineros, J.; Quade, S. R. E.; Sedor, J. R.; Schelling, J.; Pickens, A.; Humbert, L.; Getz-Fradley, L.; Adler, S.; Ipp, E.; Pahl, M.; Seldin, M. F.; Snyder, S.; Tayek, J.; Hernandez, E.; LaPage, J.; Garcia, C.; Gonzalez, J.; Aguilar, M.; Klag, Michael John; Parekh, R.; Kao, L.; Meoni, Lucy Ann; Whitehead, T.; Chester, J.; Knowler, W. C.; Hanson, R. L.; Nelson, R. G.; Wolford, J.; Jones, L.; Juan, R.; Lovelace, R.; Luethe, C.; Phillips, L. M.; Sewemaenewa, J.; Sili, I.; Waseta, B.; Saad, M. F.; Nicholas, S. B.; Chen, Y. D. I.; Guo, X.; Rotter, J.; Taylor, K.; Budgett, M.; Hariri, F.; Zager, P.; Shah, V.; Scavini, M.; Bobelu, A.; Abboud, H.; Arar, N.; Duggirala, R.; Kasinath, B. S.; Thameem, F.; Stern, M.; Freedman, B. I.; Bowden, D. W.; Langefeld, C. D.; Satko, S. C.; Rich, S. S.; Warren, S.; Viverette, S.; Brooks, G.; Young, R.; Spainhour, M.; Winkler, C.; Smith, M. W.; Thompson, M.; Hanson, R.; Kessing, B.; Leehey, D. J.; Barone, G.; Thornley-Brown, D.; Jefferson, C.; Kohn, O. F.; Brown, C. S.; Briggs, J. P.; Kimmel, P. L.; Rasooly, R.; Warnock, D.; Cardon, L.; Chakraborty, R.; Dunston, G. M.; Hostetter, T.; O'Brien, S. J.; Rioux, J.; Spielman, R.

In: BMC Genomics, Vol. 17, No. 1, 325, 04.05.2016.

@article{cf41bf0ecd204e87a0c749e39d932bd5,
title = "Selecting SNPs informative for African, American Indian and European Ancestry: Application to the Family Investigation of Nephropathy and Diabetes (FIND)",
abstract = "Background: The presence of population structure in a sample may confound the search for important genetic loci associated with disease. Our four samples in the Family Investigation of Nephropathy and Diabetes (FIND), European Americans, Mexican Americans, African Americans, and American Indians, are part of a genome-wide association study in which population structure might be particularly important. We therefore decided to study in detail one component of this, individual genetic ancestry (IGA). From SNPs present on the Affymetrix 6.0 Human SNP array, we identified 3 sets of ancestry informative markers (AIMs), each maximized for the information in one of the three contrasts among ancestral populations: Europeans (HAPMAP, CEU), Africans (HAPMAP, YRI and LWK), and Native Americans (full heritage Pima Indians). We estimate IGA and present an algorithm for their standard errors, compare IGA to principal components, emphasize the importance of balancing information in the AIMs, and test the association of IGA with diabetic nephropathy in the combined sample. Results: A fixed parental allele maximum likelihood algorithm was applied to the FIND to estimate IGA in four samples: 869 American Indians; 1385 African Americans; 1451 Mexican Americans; and 826 European Americans. When the information in the AIMs is unbalanced, the estimates are incorrect with large error. Individual genetic admixture is highly correlated with principal components for capturing population structure. It takes ~700 SNPs to reduce the average standard error of individual admixture below 0.01. When the samples are combined, the resulting population structure creates associations between IGA and diabetic nephropathy. Conclusions: The identified set of AIMs, which includes American Indian parental allele frequencies, may be particularly useful for estimating genetic admixture in populations from the Americas. Failure to balance information in maximum likelihood, poly-ancestry models creates biased estimates of individual admixture with large error. This also occurs when estimating IGA using the Bayesian clustering method as implemented in the program STRUCTURE. Odds ratios for the associations of IGA with disease are consistent with what is known about the incidence and prevalence of diabetic nephropathy in these populations.",
keywords = "Diabetic nephropathy, Individual genetic ancestry, Population structure, SNP",
author = "Williams, {Robert C.} and Elston, {Robert C.} and Pankaj Kumar and Knowler, {William C.} and Abboud, {Hanna E.} and Sharon Adler and Bowden, {Donald W.} and Jasmin Divers and Freedman, {Barry I.} and Igo, {Robert P.} and Eli Ipp and Iyengar, {Sudha K.} and Kimmel, {Paul L.} and Klag, {Michael J.} and Orly Kohn and Langefeld, {Carl D.} and Leehey, {David J.} and Nelson, {Robert G.} and Nicholas, {Susanne B.} and Pahl, {Madeleine V.} and Parekh, {Rulan S.} and Rotter, {Jerome I.} and Schelling, {Jeffrey R.} and Sedor, {John R.} and Shah, {Vallabh O.} and Smith, {Michael W.} and Taylor, {Kent D.} and Farook Thameem and Denyse Thornley-Brown and Winkler, {Cheryl A.} and Xiuqing Guo and Phillip Zager and Hanson, {Robert L.} and {FIND Research Group} and Iyengar, {S. K.} and Elston, {R. C.} and Goddard, {K. A. B.} and Olson, {J. M.} and S. Ialacci and J. Fondran and A. Horvath and R. Igo and G. Jun and K. Kramp and J. Molineros and Quade, {S. R. E.} and Sedor, {J. R.} and J. Schelling and A. Pickens and L. Humbert and L. Getz-Fradley and S. Adler and E. Ipp and M. Pahl and Seldin, {M. F.} and S. Snyder and J. Tayek and E. Hernandez and J. LaPage and C. Garcia and J. Gonzalez and M. Aguilar and Klag, {Michael John} and R. Parekh and L. Kao and Meoni, {Lucy Ann} and T. Whitehead and J. Chester and Knowler, {W. C.} and Hanson, {R. L.} and Nelson, {R. G.} and J. Wolford and L. Jones and R. Juan and R. Lovelace and C. Luethe and Phillips, {L. M.} and J. Sewemaenewa and I. Sili and B. Waseta and Saad, {M. F.} and Nicholas, {S. B.} and Chen, {Y. D. I.} and X. Guo and J. Rotter and K. Taylor and M. Budgett and F. Hariri and P. Zager and V. Shah and M. Scavini and A. Bobelu and H. Abboud and N. Arar and R. Duggirala and Kasinath, {B. S.} and F. Thameem and M. Stern and Freedman, {B. I.} and Bowden, {D. W.} and Langefeld, {C. D.} and Satko, {S. C.} and Rich, {S. S.} and S. Warren and S. Viverette and G. Brooks and R. Young and M. Spainhour and C. Winkler and Smith, {M. W.} and M. Thompson and R. Hanson and B. Kessing and Leehey, {D. J.} and G. Barone and D. Thornley-Brown and C. Jefferson and Kohn, {O. F.} and Brown, {C. S.} and Briggs, {J. P.} and Kimmel, {P. L.} and R. Rasooly and D. Warnock and L. Cardon and R. Chakraborty and Dunston, {G. M.} and T. Hostetter and O'Brien, {S. J.} and J. Rioux and R. Spielman",
year = "2016",
month = "5",
day = "4",
doi = "10.1186/s12864-016-2654-x",
language = "English (US)",
volume = "17",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - Selecting SNPs informative for African, American Indian and European Ancestry

T2 - Application to the Family Investigation of Nephropathy and Diabetes (FIND)

AU - Williams, Robert C.

AU - Elston, Robert C.

AU - Kumar, Pankaj

AU - Knowler, William C.

AU - Abboud, Hanna E.

AU - Adler, Sharon

AU - Bowden, Donald W.

AU - Divers, Jasmin

AU - Freedman, Barry I.

AU - Igo, Robert P.

AU - Ipp, Eli

AU - Iyengar, Sudha K.

AU - Kimmel, Paul L.

AU - Klag, Michael J.

AU - Kohn, Orly

AU - Langefeld, Carl D.

AU - Leehey, David J.

AU - Nelson, Robert G.

AU - Nicholas, Susanne B.

AU - Pahl, Madeleine V.

AU - Parekh, Rulan S.

AU - Rotter, Jerome I.

AU - Schelling, Jeffrey R.

AU - Sedor, John R.

AU - Shah, Vallabh O.

AU - Smith, Michael W.

AU - Taylor, Kent D.

AU - Thameem, Farook

AU - Thornley-Brown, Denyse

AU - Winkler, Cheryl A.

AU - Guo, Xiuqing

AU - Zager, Phillip

AU - Hanson, Robert L.

AU - FIND Research Group

AU - Iyengar, S. K.

AU - Elston, R. C.

AU - Goddard, K. A. B.

AU - Olson, J. M.

AU - Ialacci, S.

AU - Fondran, J.

AU - Horvath, A.

AU - Igo, R.

AU - Jun, G.

AU - Kramp, K.

AU - Molineros, J.

AU - Quade, S. R. E.

AU - Sedor, J. R.

AU - Schelling, J.

AU - Pickens, A.

AU - Humbert, L.

AU - Getz-Fradley, L.

AU - Adler, S.

AU - Ipp, E.

AU - Pahl, M.

AU - Seldin, M. F.

AU - Snyder, S.

AU - Tayek, J.

AU - Hernandez, E.

AU - LaPage, J.

AU - Garcia, C.

AU - Gonzalez, J.

AU - Aguilar, M.

AU - Klag, Michael John

AU - Parekh, R.

AU - Kao, L.

AU - Meoni, Lucy Ann

AU - Whitehead, T.

AU - Chester, J.

AU - Knowler, W. C.

AU - Hanson, R. L.

AU - Nelson, R. G.

AU - Wolford, J.

AU - Jones, L.

AU - Juan, R.

AU - Lovelace, R.

AU - Luethe, C.

AU - Phillips, L. M.

AU - Sewemaenewa, J.

AU - Sili, I.

AU - Waseta, B.

AU - Saad, M. F.

AU - Nicholas, S. B.

AU - Chen, Y. D. I.

AU - Guo, X.

AU - Rotter, J.

AU - Taylor, K.

AU - Budgett, M.

AU - Hariri, F.

AU - Zager, P.

AU - Shah, V.

AU - Scavini, M.

AU - Bobelu, A.

AU - Abboud, H.

AU - Arar, N.

AU - Duggirala, R.

AU - Kasinath, B. S.

AU - Thameem, F.

AU - Stern, M.

AU - Freedman, B. I.

AU - Bowden, D. W.

AU - Langefeld, C. D.

AU - Satko, S. C.

AU - Rich, S. S.

AU - Warren, S.

AU - Viverette, S.

AU - Brooks, G.

AU - Young, R.

AU - Spainhour, M.

AU - Winkler, C.

AU - Smith, M. W.

AU - Thompson, M.

AU - Hanson, R.

AU - Kessing, B.

AU - Leehey, D. J.

AU - Barone, G.

AU - Thornley-Brown, D.

AU - Jefferson, C.

AU - Kohn, O. F.

AU - Brown, C. S.

AU - Briggs, J. P.

AU - Kimmel, P. L.

AU - Rasooly, R.

AU - Warnock, D.

AU - Cardon, L.

AU - Chakraborty, R.

AU - Dunston, G. M.

AU - Hostetter, T.

AU - O'Brien, S. J.

AU - Rioux, J.

AU - Spielman, R.

PY - 2016/5/4

Y1 - 2016/5/4

AB - Background: The presence of population structure in a sample may confound the search for important genetic loci associated with disease. Our four samples in the Family Investigation of Nephropathy and Diabetes (FIND), European Americans, Mexican Americans, African Americans, and American Indians, are part of a genome-wide association study in which population structure might be particularly important. We therefore decided to study in detail one component of this, individual genetic ancestry (IGA). From SNPs present on the Affymetrix 6.0 Human SNP array, we identified 3 sets of ancestry informative markers (AIMs), each maximized for the information in one of the three contrasts among ancestral populations: Europeans (HAPMAP, CEU), Africans (HAPMAP, YRI and LWK), and Native Americans (full heritage Pima Indians). We estimate IGA and present an algorithm for their standard errors, compare IGA to principal components, emphasize the importance of balancing information in the AIMs, and test the association of IGA with diabetic nephropathy in the combined sample. Results: A fixed parental allele maximum likelihood algorithm was applied to the FIND to estimate IGA in four samples: 869 American Indians; 1385 African Americans; 1451 Mexican Americans; and 826 European Americans. When the information in the AIMs is unbalanced, the estimates are incorrect with large error. Individual genetic admixture is highly correlated with principal components for capturing population structure. It takes ~700 SNPs to reduce the average standard error of individual admixture below 0.01. When the samples are combined, the resulting population structure creates associations between IGA and diabetic nephropathy. Conclusions: The identified set of AIMs, which includes American Indian parental allele frequencies, may be particularly useful for estimating genetic admixture in populations from the Americas. Failure to balance information in maximum likelihood, poly-ancestry models creates biased estimates of individual admixture with large error. This also occurs when estimating IGA using the Bayesian clustering method as implemented in the program STRUCTURE. Odds ratios for the associations of IGA with disease are consistent with what is known about the incidence and prevalence of diabetic nephropathy in these populations.

KW - Diabetic nephropathy

KW - Individual genetic ancestry

KW - Population structure

KW - SNP

UR - http://www.scopus.com/inward/record.url?scp=84977651132&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84977651132&partnerID=8YFLogxK

U2 - 10.1186/s12864-016-2654-x

DO - 10.1186/s12864-016-2654-x

M3 - Article

C2 - 27142425

AN - SCOPUS:84977651132

VL - 17

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

IS - 1

M1 - 325

ER -