TY - JOUR
T1 - Effect of non-normality and low count variants on cross-phenotype association tests in GWAS
AU - Ray, Debashree
AU - Chatterjee, Nilanjan
N1 - Funding Information:
Acknowledgements: This research was supported in part by the NIH for the Environmental influences of Child Health Outcomes Data Analysis Center (U24OD023382). It was carried out using a computing cluster, the Joint High Performance Computing Exchange, at the Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health. We thank Dr Michael Boehnke and Dr Markku Laakso for kindly providing us access to individual-level phenotype data on the METSIM amino acid traits. We are grateful to the reviewers for their constructive feedback that immensely helped us improve this article. Finally, DR is thankful to Dr Matthew Stephens for a stimulating conversation on multivariate analyses in GWAS a few years back that sowed the seeds for this article.
Publisher Copyright:
© 2019, The Author(s), under exclusive licence to European Society of Human Genetics.
PY - 2020/3/1
Y1 - 2020/3/1
N2 - Many complex human diseases, such as type 2 diabetes, are characterized by multiple underlying traits/phenotypes that have substantially shared genetic architecture. Multivariate analysis of correlated traits has the potential to increase the power of detecting underlying common genetic loci. Several cross-phenotype association methods have been proposed: some require individual-level data on traits and genotypes, while the others require only summary-level data. In this article, we explore whether non-normality of multivariate trait distribution affects the inference from some of the existing multi-trait methods and how that effect is dependent on the allele count of the genetic variant being tested. We find that most of these tests are susceptible to biases that lead to spurious association signals. Even after controlling for confounders that may contribute to non-normality and then applying inverse normal transformation on the residuals of each trait, these tests may have inflated type I errors for variants with low minor allele counts (MACs). A likelihood ratio test of association based on the ordinal regression of individual-level genotype conditional on the traits seems to be the least biased and can maintain type I error when the MAC is reasonably large (e.g., MAC > 30). Application of these methods to publicly available summary statistics of eight amino acid traits on European samples seems to exhibit systematic inflation (especially for variants with low MAC), which is consistent with our findings from simulation experiments.
AB - Many complex human diseases, such as type 2 diabetes, are characterized by multiple underlying traits/phenotypes that have substantially shared genetic architecture. Multivariate analysis of correlated traits has the potential to increase the power of detecting underlying common genetic loci. Several cross-phenotype association methods have been proposed: some require individual-level data on traits and genotypes, while the others require only summary-level data. In this article, we explore whether non-normality of multivariate trait distribution affects the inference from some of the existing multi-trait methods and how that effect is dependent on the allele count of the genetic variant being tested. We find that most of these tests are susceptible to biases that lead to spurious association signals. Even after controlling for confounders that may contribute to non-normality and then applying inverse normal transformation on the residuals of each trait, these tests may have inflated type I errors for variants with low minor allele counts (MACs). A likelihood ratio test of association based on the ordinal regression of individual-level genotype conditional on the traits seems to be the least biased and can maintain type I error when the MAC is reasonably large (e.g., MAC > 30). Application of these methods to publicly available summary statistics of eight amino acid traits on European samples seems to exhibit systematic inflation (especially for variants with low MAC), which is consistent with our findings from simulation experiments.
UR - http://www.scopus.com/inward/record.url?scp=85074339292&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074339292&partnerID=8YFLogxK
U2 - 10.1038/s41431-019-0514-2
DO - 10.1038/s41431-019-0514-2
M3 - Article
C2 - 31582815
AN - SCOPUS:85074339292
SN - 1018-4813
VL - 28
SP - 300
EP - 312
JO - European Journal of Human Genetics
JF - European Journal of Human Genetics
IS - 3
ER -