Abstract
Genetic epidemiologic studies often collect genotype data at multiple loci within a genomic region of interest from a sample of unrelated individuals. One popular method for analyzing such data is to assess whether haplotypes, i.e., the arrangements of alleles along individual chromosomes, are associated with the disease phenotype or not. For many study subjects, however, the exact haplotype configuration on the pair of homologous chromosomes cannot be derived with certainty from the available locus-specific genotype data (phase ambiguity). In this article, we consider estimating haplotype-specific association parameters in the Cox proportional hazards model, using genotype, environmental exposure, and the disease endpoint data collected from cohort or nested case-control studies. We study alternative Expectation-Maximization algorithms for estimating haplotype frequencies from cohort and nested case-control studies. Based on a hazard function of the disease derived from the observed genotype data, we then propose a semiparametric method for joint estimation of relative-risk parameters and the cumulative baseline hazard function. The method is greatly simplified under a rare disease assumption, for which an asymptotic variance estimator is also proposed. The performance of the proposed estimators is assessed via simulation studies. An application of the proposed method is presented, using data from the Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study.
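As a rough illustration of the phase-ambiguity problem described above, the sketch below runs the textbook Expectation-Maximization iteration for haplotype frequencies on unphased genotypes. It is not the article's cohort or nested case-control variant: it assumes a simple random sample of unrelated individuals, Hardy-Weinberg equilibrium, and biallelic loci coded as minor-allele counts (0, 1, or 2). The function names `compatible_pairs` and `em_haplotype_frequencies` are hypothetical and chosen only for this example.

```python
from itertools import product


def compatible_pairs(genotype):
    """Return all ordered haplotype pairs (h1, h2) whose locus-wise sum equals the genotype."""
    options = []
    for g in genotype:
        if g == 0:
            options.append([(0, 0)])          # homozygous for allele 0
        elif g == 2:
            options.append([(1, 1)])          # homozygous for allele 1
        else:
            options.append([(0, 1), (1, 0)])  # heterozygous: phase unknown
    pairs = []
    for combo in product(*options):
        h1 = tuple(a for a, _ in combo)
        h2 = tuple(b for _, b in combo)
        pairs.append((h1, h2))
    return pairs


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies by EM, assuming Hardy-Weinberg equilibrium."""
    n_loci = len(genotypes[0])
    haplotypes = list(product((0, 1), repeat=n_loci))
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    pair_lists = [compatible_pairs(g) for g in genotypes]

    for _ in range(n_iter):
        # E-step: expected haplotype counts given the current frequencies.
        counts = {h: 0.0 for h in haplotypes}
        for pairs in pair_lists:
            weights = [freq[h1] * freq[h2] for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total

        # M-step: each individual carries two haplotypes.
        new_freq = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        change = max(abs(new_freq[h] - freq[h]) for h in haplotypes)
        freq = new_freq
        if change < tol:
            break
    return freq


if __name__ == "__main__":
    # Two loci; the double heterozygotes (1, 1) are the phase-ambiguous subjects.
    sample = [(0, 0)] * 3 + [(2, 2)] * 3 + [(1, 1)] * 2
    for hap, p in sorted(em_haplotype_frequencies(sample).items()):
        print(hap, round(p, 4))
```

Enumerating ordered pairs keeps the E-step weights simple, since each pair's posterior weight is then just the product of the two current haplotype frequencies. The enumeration grows exponentially with the number of heterozygous loci, so this sketch is only practical for a handful of loci.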
Original language | English (US) |
---|---|
Pages (from-to) | 28-35 |
Number of pages | 8 |
Journal | Biometrics |
Volume | 62 |
Issue number | 1 |
State | Published - Mar 2006 |
Externally published | Yes |
Keywords
- Cohort study
- Cox proportional hazards model
- Nested case-control study
- Unphased genotype data
ASJC Scopus subject areas
- Statistics and Probability
- General Biochemistry, Genetics and Molecular Biology
- General Immunology and Microbiology
- General Agricultural and Biological Sciences
- Applied Mathematics