Haplotypic structure of the X chromosome in the COGA population sample and the quality of its reconstruction by extant software packages

Fabio Marroni, Chiara Toni, Benedetto Pennato, Ya Yu Tsai, Priya Duggal, Joan E. Bailey-Wilson, Silvano Presciuttini

Research output: Contribution to journal (Article)

Abstract

Background: The haplotypes of the X chromosome can be counted directly in males, whereas the diplotypes of females can be inferred from the haplotypes of their sons or fathers. Here, we investigated: 1) the possible large-scale haplotypic structure of the X chromosome in a Caucasian population sample, given the single-nucleotide polymorphism (SNP) maps and genotypes provided by Illumina and Affymetrix for Genetic Analysis Workshop 14, and 2) the performance of widely used programs in reconstructing haplotypes from population genotypic data, given their known distribution in a sample of unrelated individuals.

Results: All possible unrelated mother-son pairs of Caucasian ancestry (N = 104) were selected from the 143 families of the Collaborative Study on the Genetics of Alcoholism pedigree files, and the diplotypes of the mothers were inferred from the X chromosomes of their sons. The marker set included 313 SNPs at an average spacing of 0.47 Mb. Linkage disequilibrium between pairs of markers was measured by the parameter D′; to measure multilocus disequilibrium, we developed an index called D* and applied it to all possible sliding windows of 5 markers each. The results showed a complex pattern of haplotypic structure, with regions of low linkage disequilibrium separated by regions of high values of D*. The following programs were evaluated for their accuracy in inferring population haplotype frequencies: 1) ARLEQUIN 2.001; 2) PHASE 2.1.1; 3) SNPHAP 1.1; 4) HAPLOBLOCK 1.2; 5) HAPLOTYPER 1.0. Performance was evaluated by the Pearson correlation coefficient (r) between the true and the inferred distributions of haplotype frequencies.

Conclusion: The SNP haplotypic structure of the X chromosome is complex, with regions of high haplotype conservation interspersed among regions of higher haplotype diversity. All the tested programs were accurate (r = 1) in reconstructing the distribution of haplotype frequencies when D* values were high. However, only PHASE achieved a high correlation coefficient (r > 0.7) under conditions of low linkage disequilibrium.
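The two standard statistics named in the abstract, pairwise D′ and the Pearson correlation between true and inferred haplotype frequencies, can be illustrated with a minimal sketch. This is not the authors' code: the function names (`d_prime`, `haplotype_frequencies`, `frequency_correlation`) and the toy haplotype strings are assumptions made for illustration, and the multilocus index D* is not reproduced because the abstract does not define it.

```python
from collections import Counter
from math import sqrt


def d_prime(haplotypes):
    """Absolute Lewontin's D' for two biallelic markers.

    `haplotypes` is a list of two-character strings, one per phased
    chromosome (e.g. male X chromosomes), such as "AB" or "ab".
    """
    n = len(haplotypes)
    a, b = haplotypes[0][0], haplotypes[0][1]  # reference allele at each marker
    p_a = sum(h[0] == a for h in haplotypes) / n
    p_b = sum(h[1] == b for h in haplotypes) / n
    p_ab = sum(h[0] == a and h[1] == b for h in haplotypes) / n
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # A monomorphic marker gives d_max == 0; report 0.0 by convention here.
    return 0.0 if d_max == 0 else abs(d) / d_max


def haplotype_frequencies(haplotypes):
    """Relative frequency of each distinct haplotype string."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {h: c / total for h, c in counts.items()}


def frequency_correlation(true_freq, inferred_freq):
    """Pearson r between two haplotype frequency distributions.

    Haplotypes missing from one distribution are given frequency 0.
    """
    keys = sorted(set(true_freq) | set(inferred_freq))
    x = [true_freq.get(k, 0.0) for k in keys]
    y = [inferred_freq.get(k, 0.0) for k in keys]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant distribution")
    return sxy / sqrt(sxx * syy)
```

In this sketch the "true" distribution would come from directly counted haplotypes and the "inferred" one from a reconstruction program's output; a program that recovers the true frequencies exactly yields r = 1.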

Original language: English (US)
Article number: S77
Journal: BMC Genetics
Volume: 6
Issue number: SUPPL. 1
State: Published - Dec 30 2005
Externally published: Yes

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)
