TY - JOUR
T1 - Sequence variations in the public human genome data reflect a bottlenecked population history
AU - Marth, Gabor
AU - Schuler, Greg
AU - Yeh, Raymond
AU - Davenport, Ruth
AU - Agarwala, Richa
AU - Church, Deanna
AU - Wheelan, Sarah
AU - Baker, Jonathan
AU - Ward, Ming
AU - Kholodov, Michael
AU - Phan, Lon
AU - Czabarka, Eva
AU - Murvai, Janos
AU - Cutler, David
AU - Wooding, Stephen
AU - Rogers, Alan
AU - Chakravarti, Aravinda
AU - Harpending, Henry C.
AU - Kwok, Pui-Yan
AU - Sherry, Stephen T.
PY - 2003
Y1 - 2003/1/7
AB - Single-nucleotide polymorphisms (SNPs) constitute the great majority of variations in the human genome, and as heritable variable landmarks they are useful markers for disease mapping and resolving population structure. Redundant coverage in overlaps of large-insert genomic clones, sequenced as part of the Human Genome Project, comprises a quarter of the genome, and it is representative in terms of base compositional and functional sequence features. We mined these regions to produce 500,000 high-confidence SNP candidates as a uniform resource for describing nucleotide diversity and its regional variation within the genome. Distributions of marker density observed at different overlap length scales under a model of recombination and population size change show that the history of the population represented by the public genome sequence is one of collapse followed by a recent phase of mild size recovery. The inferred times of collapse and recovery are Upper Paleolithic, in agreement with archaeological evidence of the initial modern human colonization of Europe.
UR - http://www.scopus.com/inward/record.url?scp=0037422542&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0037422542&partnerID=8YFLogxK
U2 - 10.1073/pnas.222673099
DO - 10.1073/pnas.222673099
M3 - Article
C2 - 12502794
AN - SCOPUS:0037422542
SN - 0027-8424
VL - 100
SP - 376
EP - 381
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 1
ER -