Composite kernel machine regression based on likelihood ratio test for joint testing of genetic and gene–environment interaction effect

Ni Zhao, Haoyu Zhang, Jennifer J. Clark, Arnab Maity, Michael C. Wu

Research output: Contribution to journal › Article

Abstract

Most common human diseases result from the combined effect of genes, environmental factors, and their interactions, so including gene–environment (GE) interactions can improve power in gene mapping studies. The standard strategy is to test SNPs one by one, using a regression model that includes both the SNP effect and the GE interaction. However, the SNP-by-SNP approach has serious limitations, such as the inability to model epistatic SNP effects, biased estimation, and reduced power. Thus, in this article, we develop a kernel machine regression framework to model the overall genetic effect of a SNP-set, considering the possible GE interaction. Specifically, we use a composite kernel to specify the overall genetic effect via a nonparametric function, and we model additional covariates parametrically within the regression framework. The composite kernel is constructed as a weighted average of two kernels, one corresponding to the genetic main effect and one corresponding to the GE interaction effect. We propose a likelihood ratio test (LRT) and a restricted likelihood ratio test (RLRT) for statistical significance. We derive a Monte Carlo approach for the finite sample distributions of the LRT and RLRT statistics. Extensive simulations and real data analysis show that our proposed method has correct type I error and can have higher power than score-based approaches under many situations.
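
The weighted-average construction described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the linear kernel for the genetic main effect, the elementwise (Hadamard) product used for the GE interaction kernel, the weight name `rho`, and the helper names `linear_kernel` and `composite_kernel`.

```python
import numpy as np


def linear_kernel(X):
    """Linear kernel X X^T (an assumed choice; other kernels are possible)."""
    X = np.asarray(X, dtype=float)
    return X @ X.T


def composite_kernel(G, E, rho):
    """Weighted average of a genetic kernel and a GE interaction kernel.

    G   : (n, p) genotype matrix for one SNP-set
    E   : (n,) or (n, q) environmental covariate(s)
    rho : hypothetical weight in [0, 1] placed on the interaction kernel
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    G = np.asarray(G, dtype=float)
    E = np.asarray(E, dtype=float).reshape(G.shape[0], -1)
    K_g = linear_kernel(G)
    # Assumed interaction kernel: elementwise product of the genetic and
    # environmental kernels, which stays positive semidefinite.
    K_ge = K_g * linear_kernel(E)
    return (1.0 - rho) * K_g + rho * K_ge


# Toy data: 6 subjects, 4 SNPs coded 0/1/2, one continuous exposure.
rng = np.random.default_rng(0)
G = rng.integers(0, 3, size=(6, 4)).astype(float)
E = rng.normal(size=6)
K = composite_kernel(G, E, rho=0.3)
```

With `rho=0` the sketch reduces to the genetic main-effect kernel alone, and with `rho=1` to the interaction kernel alone; intermediate values blend the two.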

Original language: English (US)
Journal: Biometrics
DOI: 10.1111/biom.13003
State: Published - Jan 1 2018

Keywords

  • gene–environment interactions
  • kernel machine testing
  • likelihood ratio test
  • multiple variance components
  • spectral decomposition
  • unidentifiable conditions

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry, Genetics and Molecular Biology(all)
  • Immunology and Microbiology(all)
  • Agricultural and Biological Sciences(all)
  • Applied Mathematics

Cite this

Composite kernel machine regression based on likelihood ratio test for joint testing of genetic and gene–environment interaction effect. / Zhao, Ni; Zhang, Haoyu; Clark, Jennifer J.; Maity, Arnab; Wu, Michael C.

In: Biometrics, 01.01.2018.


@article{c8c8ad12ded74d42abee443a3af74262,
title = "Composite kernel machine regression based on likelihood ratio test for joint testing of genetic and gene–environment interaction effect",
abstract = "Most common human diseases result from the combined effect of genes, environmental factors, and their interactions, so including gene–environment (GE) interactions can improve power in gene mapping studies. The standard strategy is to test SNPs one by one, using a regression model that includes both the SNP effect and the GE interaction. However, the SNP-by-SNP approach has serious limitations, such as the inability to model epistatic SNP effects, biased estimation, and reduced power. Thus, in this article, we develop a kernel machine regression framework to model the overall genetic effect of a SNP-set, considering the possible GE interaction. Specifically, we use a composite kernel to specify the overall genetic effect via a nonparametric function, and we model additional covariates parametrically within the regression framework. The composite kernel is constructed as a weighted average of two kernels, one corresponding to the genetic main effect and one corresponding to the GE interaction effect. We propose a likelihood ratio test (LRT) and a restricted likelihood ratio test (RLRT) for statistical significance. We derive a Monte Carlo approach for the finite sample distributions of the LRT and RLRT statistics. Extensive simulations and real data analysis show that our proposed method has correct type I error and can have higher power than score-based approaches under many situations.",
keywords = "gene–environment interactions, kernel machine testing, likelihood ratio test, multiple variance components, spectral decomposition, unidentifiable conditions",
author = "Zhao, Ni and Zhang, Haoyu and Clark, {Jennifer J.} and Maity, Arnab and Wu, {Michael C.}",
year = "2018",
month = "1",
day = "1",
doi = "10.1111/biom.13003",
language = "English (US)",
journal = "Biometrics",
issn = "0006-341X",
publisher = "Wiley-Blackwell",

}

TY - JOUR

T1 - Composite kernel machine regression based on likelihood ratio test for joint testing of genetic and gene–environment interaction effect

AU - Zhao, Ni

AU - Zhang, Haoyu

AU - Clark, Jennifer J.

AU - Maity, Arnab

AU - Wu, Michael C.

PY - 2018/1/1

Y1 - 2018/1/1

N2 - Most common human diseases result from the combined effect of genes, environmental factors, and their interactions, so including gene–environment (GE) interactions can improve power in gene mapping studies. The standard strategy is to test SNPs one by one, using a regression model that includes both the SNP effect and the GE interaction. However, the SNP-by-SNP approach has serious limitations, such as the inability to model epistatic SNP effects, biased estimation, and reduced power. Thus, in this article, we develop a kernel machine regression framework to model the overall genetic effect of a SNP-set, considering the possible GE interaction. Specifically, we use a composite kernel to specify the overall genetic effect via a nonparametric function, and we model additional covariates parametrically within the regression framework. The composite kernel is constructed as a weighted average of two kernels, one corresponding to the genetic main effect and one corresponding to the GE interaction effect. We propose a likelihood ratio test (LRT) and a restricted likelihood ratio test (RLRT) for statistical significance. We derive a Monte Carlo approach for the finite sample distributions of the LRT and RLRT statistics. Extensive simulations and real data analysis show that our proposed method has correct type I error and can have higher power than score-based approaches under many situations.

KW - gene–environment interactions

KW - kernel machine testing

KW - likelihood ratio test

KW - multiple variance components

KW - spectral decomposition

KW - unidentifiable conditions

UR - http://www.scopus.com/inward/record.url?scp=85063616403&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063616403&partnerID=8YFLogxK

U2 - 10.1111/biom.13003

DO - 10.1111/biom.13003

M3 - Article

JO - Biometrics

JF - Biometrics

SN - 0006-341X

ER -