Importance measures for epistatic interactions in case-parent trios

Holger Schwender, Katherine Bowers, Daniele Fallin, Ingo Ruczinski

Research output: Contribution to journal (article)

Abstract

Ensemble methods (such as Bagging and Random Forests) take advantage of unstable base learners (such as decision trees) to improve predictions, and offer measures of variable importance useful for variable selection. LogicFS has been proposed as such an ensemble learner for case-control studies when interactions of single nucleotide polymorphisms (SNPs) are of particular interest. LogicFS uses bootstrap samples of the data and employs the Boolean trees derived in logic regression as base learners to create ensembles of models that allow for the quantification of the contributions of epistatic interactions to the disease risk. In this article, we propose an extension of logicFS suitable for case-parent trio data, and derive an additional importance measure that is much less influenced by linkage disequilibrium between SNPs than the measure originally used in logicFS. We illustrate the performance of the novel procedure in simulation studies and in a case study of 461 case-parent trios with autistic children.

Original language: English (US)
Pages (from-to): 122-132
Number of pages: 11
Journal: Annals of Human Genetics
Volume: 75
Issue number: 1
DOI: 10.1111/j.1469-1809.2010.00623.x
State: Published - Jan 2011

Fingerprint

Single Nucleotide Polymorphism
Decision Trees
Linkage Disequilibrium
Case-Control Studies
Forests

Keywords

  • Autism
  • Epistatic interaction
  • Family based association study
  • Gene-gene interaction
  • LogicFS
  • Trio logic regression

ASJC Scopus subject areas

  • Genetics (clinical)
  • Genetics

Cite this

Importance measures for epistatic interactions in case-parent trios. / Schwender, Holger; Bowers, Katherine; Fallin, Daniele; Ruczinski, Ingo.

In: Annals of Human Genetics, Vol. 75, No. 1, 01.2011, p. 122-132.

@article{b4798636337d4e42b9d09c1e2b2ae553,
title = "Importance measures for epistatic interactions in case-parent trios",
abstract = "Ensemble methods (such as Bagging and Random Forests) take advantage of unstable base learners (such as decision trees) to improve predictions, and offer measures of variable importance useful for variable selection. LogicFS has been proposed as such an ensemble learner for case-control studies when interactions of single nucleotide polymorphisms (SNPs) are of particular interest. LogicFS uses bootstrap samples of the data and employs the Boolean trees derived in logic regression as base learners to create ensembles of models that allow for the quantification of the contributions of epistatic interactions to the disease risk. In this article, we propose an extension of logicFS suitable for case-parent trio data, and derive an additional importance measure that is much less influenced by linkage disequilibrium between SNPs than the measure originally used in logicFS. We illustrate the performance of the novel procedure in simulation studies and in a case study of 461 case-parent trios with autistic children.",
keywords = "Autism, Epistatic interaction, Family based association study, Gene-gene interaction, LogicFS, Trio logic regression",
author = "Holger Schwender and Katherine Bowers and Fallin, Daniele and Ingo Ruczinski",
year = "2011",
month = "1",
doi = "10.1111/j.1469-1809.2010.00623.x",
language = "English (US)",
volume = "75",
pages = "122--132",
journal = "Annals of Human Genetics",
issn = "0003-4800",
publisher = "Wiley-Blackwell",
number = "1",

}

TY - JOUR

T1 - Importance measures for epistatic interactions in case-parent trios

AU - Schwender, Holger

AU - Bowers, Katherine

AU - Fallin, Daniele

AU - Ruczinski, Ingo

PY - 2011/1

Y1 - 2011/1

N2 - Ensemble methods (such as Bagging and Random Forests) take advantage of unstable base learners (such as decision trees) to improve predictions, and offer measures of variable importance useful for variable selection. LogicFS has been proposed as such an ensemble learner for case-control studies when interactions of single nucleotide polymorphisms (SNPs) are of particular interest. LogicFS uses bootstrap samples of the data and employs the Boolean trees derived in logic regression as base learners to create ensembles of models that allow for the quantification of the contributions of epistatic interactions to the disease risk. In this article, we propose an extension of logicFS suitable for case-parent trio data, and derive an additional importance measure that is much less influenced by linkage disequilibrium between SNPs than the measure originally used in logicFS. We illustrate the performance of the novel procedure in simulation studies and in a case study of 461 case-parent trios with autistic children.

KW - Autism

KW - Epistatic interaction

KW - Family based association study

KW - Gene-gene interaction

KW - LogicFS

KW - Trio logic regression

UR - http://www.scopus.com/inward/record.url?scp=78650134503&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78650134503&partnerID=8YFLogxK

U2 - 10.1111/j.1469-1809.2010.00623.x

DO - 10.1111/j.1469-1809.2010.00623.x

M3 - Article

C2 - 21118192

AN - SCOPUS:78650134503

VL - 75

SP - 122

EP - 132

JO - Annals of Human Genetics

JF - Annals of Human Genetics

SN - 0003-4800

IS - 1

ER -