Defining haplotype blocks and tag single-nucleotide polymorphisms in the human genome

Thomas G. Schulze; Kui Zhang; Yu Sheng Chen; Nirmala Akula; Fengzhu Sun; Francis J. McMahon

doi:10.1093/hmg/ddh035

Research output: Contribution to journal › Article › peer-review

37 Scopus citations

Abstract

Recent studies suggest that the genome is organized into blocks of haplotypes, and efforts to create a genome-wide haplotype map of single-nucleotide polymorphisms (SNPs) are already underway. Haplotype blocks are defined algorithmically and to date several algorithms have been proposed. However, little is known about their relative performance in real data or about the impact of allele frequencies and parameter choices on the detection of haplotype blocks and the markers that tag them. Here we present a formal comparison of two major algorithms, a linkage disequilibrium (LD)-based method and a dynamic programming algorithm (DPA), in three chromosomal regions differing in gene content and recombination rate. The two methods produced strikingly different results. DPA identified fewer and larger haplotype blocks as well as a smaller set of tag SNPs than the LD method. For both methods, the results were strongly dependent on the allele frequency. Decreasing the minor allele frequency led to an up to 3.7-fold increase in the number of haplotype blocks and tag SNPs. Definition of haplotype blocks and tag SNPs was also sensitive to parameter changes, but the results could not be reconciled simply by parameter adjustment. These results show that two major methods for detecting haplotype blocks and tag SNPs can produce different results in the same data and that these results are sensitive to marker allele frequencies and parameter choices. More information is needed to guide the choice of method, marker allele frequencies, and parameters in the development of a haplotype map.

Original language	English (US)
Pages (from-to)	335-342
Number of pages	8
Journal	Human Molecular Genetics
Volume	13
Issue number	3
DOIs	https://doi.org/10.1093/hmg/ddh035
State	Published - Feb 1 2004
Externally published	Yes

ASJC Scopus subject areas

Molecular Biology
Genetics
Genetics (clinical)

Cite this

@article{1064e7a6e70e434e8442eec5b9f852c8,

title = "Defining haplotype blocks and tag single-nucleotide polymorphisms in the human genome",

abstract = "Recent studies suggest that the genome is organized into blocks of haplotypes, and efforts to create a genome-wide haplotype map of single-nucleotide polymorphisms (SNPs) are already underway. Haplotype blocks are defined algorithmically and to date several algorithms have been proposed. However, little is known about their relative performance in real data or about the impact of allele frequencies and parameter choices on the detection of haplotype blocks and the markers that tag them. Here we present a formal comparison of two major algorithms, a linkage disequilibrium (LD)-based method and a dynamic programming algorithm (DPA), in three chromosomal regions differing in gene content and recombination rate. The two methods produced strikingly different results. DPA identified fewer and larger haplotype blocks as well as a smaller set of tag SNPs than the LD method. For both methods, the results were strongly dependent on the allele frequency. Decreasing the minor allele frequency led to an up to 3.7-fold increase in the number of haplotype blocks and tag SNPs. Definition of haplotype blocks and tag SNPs was also sensitive to parameter changes, but the results could not be reconciled simply by parameter adjustment. These results show that two major methods for detecting haplotype blocks and tag SNPs can produce different results in the same data and that these results are sensitive to marker allele frequencies and parameter choices. More information is needed to guide the choice of method, marker allele frequencies, and parameters in the development of a haplotype map.",

author = "Schulze, {Thomas G.} and Kui Zhang and Chen, {Yu Sheng} and Nirmala Akula and Fengzhu Sun and McMahon, {Francis J.}",

note = "Funding Information: Supported by grants from the National Institute of Mental Health, the Edward F. Mallinckrodt Jr Foundation, the Chicago Brain Research Institute, and the National Alliance for Research on Schizophrenia and Depression (Young Investigators Awards to T.G.S. and Y.S.C.). K.Z. and F.S. were supported by a grant from the National Institutes of Health (NIH P50 HG 002790). We gratefully acknowledge help from Gonçalo Abecasis in obtaining the chromosome 22 genotypes from The Wellcome Trust Sanger Institute.",

year = "2004",

month = feb,

day = "1",

doi = "10.1093/hmg/ddh035",

language = "English (US)",

volume = "13",

pages = "335--342",

journal = "Human Molecular Genetics",

issn = "0964-6906",

publisher = "Oxford University Press",

number = "3",

}

TY - JOUR

T1 - Defining haplotype blocks and tag single-nucleotide polymorphisms in the human genome

AU - Schulze, Thomas G.

AU - Zhang, Kui

AU - Chen, Yu Sheng

AU - Akula, Nirmala

AU - Sun, Fengzhu

AU - McMahon, Francis J.

N1 - Funding Information: Supported by grants from the National Institute of Mental Health, the Edward F. Mallinckrodt Jr Foundation, the Chicago Brain Research Institute, and the National Alliance for Research on Schizophrenia and Depression (Young Investigators Awards to T.G.S. and Y.S.C.). K.Z. and F.S. were supported by a grant from the National Institutes of Health (NIH P50 HG 002790). We gratefully acknowledge help from Gonçalo Abecasis in obtaining the chromosome 22 genotypes from The Wellcome Trust Sanger Institute.

PY - 2004/2/1

Y1 - 2004/2/1

N2 - Recent studies suggest that the genome is organized into blocks of haplotypes, and efforts to create a genome-wide haplotype map of single-nucleotide polymorphisms (SNPs) are already underway. Haplotype blocks are defined algorithmically and to date several algorithms have been proposed. However, little is known about their relative performance in real data or about the impact of allele frequencies and parameter choices on the detection of haplotype blocks and the markers that tag them. Here we present a formal comparison of two major algorithms, a linkage disequilibrium (LD)-based method and a dynamic programming algorithm (DPA), in three chromosomal regions differing in gene content and recombination rate. The two methods produced strikingly different results. DPA identified fewer and larger haplotype blocks as well as a smaller set of tag SNPs than the LD method. For both methods, the results were strongly dependent on the allele frequency. Decreasing the minor allele frequency led to an up to 3.7-fold increase in the number of haplotype blocks and tag SNPs. Definition of haplotype blocks and tag SNPs was also sensitive to parameter changes, but the results could not be reconciled simply by parameter adjustment. These results show that two major methods for detecting haplotype blocks and tag SNPs can produce different results in the same data and that these results are sensitive to marker allele frequencies and parameter choices. More information is needed to guide the choice of method, marker allele frequencies, and parameters in the development of a haplotype map.

AB - Recent studies suggest that the genome is organized into blocks of haplotypes, and efforts to create a genome-wide haplotype map of single-nucleotide polymorphisms (SNPs) are already underway. Haplotype blocks are defined algorithmically and to date several algorithms have been proposed. However, little is known about their relative performance in real data or about the impact of allele frequencies and parameter choices on the detection of haplotype blocks and the markers that tag them. Here we present a formal comparison of two major algorithms, a linkage disequilibrium (LD)-based method and a dynamic programming algorithm (DPA), in three chromosomal regions differing in gene content and recombination rate. The two methods produced strikingly different results. DPA identified fewer and larger haplotype blocks as well as a smaller set of tag SNPs than the LD method. For both methods, the results were strongly dependent on the allele frequency. Decreasing the minor allele frequency led to an up to 3.7-fold increase in the number of haplotype blocks and tag SNPs. Definition of haplotype blocks and tag SNPs was also sensitive to parameter changes, but the results could not be reconciled simply by parameter adjustment. These results show that two major methods for detecting haplotype blocks and tag SNPs can produce different results in the same data and that these results are sensitive to marker allele frequencies and parameter choices. More information is needed to guide the choice of method, marker allele frequencies, and parameters in the development of a haplotype map.

UR - http://www.scopus.com/inward/record.url?scp=1042280308&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=1042280308&partnerID=8YFLogxK

U2 - 10.1093/hmg/ddh035

DO - 10.1093/hmg/ddh035

M3 - Article

C2 - 14681300

AN - SCOPUS:1042280308

SN - 0964-6906

VL - 13

SP - 335

EP - 342

JO - Human Molecular Genetics

JF - Human Molecular Genetics

IS - 3

ER -
