Logic Regression

Ingo Ruczinski, Charles Kooperberg, Michael LeBlanc

Research output: Contribution to journal › Review article

Abstract

Logic regression is an adaptive regression methodology that attempts to construct predictors as Boolean combinations of binary covariates. In many regression problems a model is developed that relates the main effects (the predictors or transformations thereof) to the response, while interactions are usually kept simple (two- to three-way interactions at most). Often, especially when all predictors are binary, the interaction between many predictors may be what causes the differences in response. This issue arises, for example, in the analysis of SNP microarray data or in some data mining problems. In the proposed methodology, given a set of binary predictors we create new predictors such as "X1, X2, X3, and X4 are true," or "X5 or X6, but not X7 are true." In more specific terms: we try to fit regression models of the form g(E[Y]) = b0 + b1 L1 + ... + bn Ln, where Lj is any Boolean expression of the predictors. The Lj and bj are estimated simultaneously using a simulated annealing algorithm. This article discusses how to fit logic regression models, how to carry out model selection for these models, and gives some examples.
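
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an identity link, a single Boolean term L restricted to a conjunction (AND) of possibly negated predictors, and a plain residual-sum-of-squares score. All function names, the move set, the cooling schedule, and the toy data are hypothetical choices made for illustration.

```python
import math
import random


def evaluate(term, row):
    """Evaluate a conjunction of literals on one observation.

    `term` is a frozenset of (index, negated) pairs; the empty term is True.
    """
    return all(row[i] != negated for i, negated in term)


def rss(term, X, y):
    """Residual sum of squares of the least-squares fit y ~ b0 + b1 * L.

    With one binary predictor L, the fitted values are the two group means.
    """
    groups = {True: [], False: []}
    for row, target in zip(X, y):
        groups[evaluate(term, row)].append(target)
    total = 0.0
    for values in groups.values():
        if values:
            mean = sum(values) / len(values)
            total += sum((v - mean) ** 2 for v in values)
    return total


def neighbor(term, n_vars, rng):
    """Propose a move: add, remove, or flip the negation of one literal."""
    i = rng.randrange(n_vars)
    literals = dict(term)
    if i not in literals:
        literals[i] = rng.random() < 0.5
    elif rng.random() < 0.5:
        del literals[i]
    else:
        literals[i] = not literals[i]
    return frozenset(literals.items())


def anneal(X, y, n_steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Simulated annealing over conjunctions, minimizing the RSS."""
    rng = random.Random(seed)
    n_vars = len(X[0])
    current = frozenset()
    current_score = rss(current, X, y)
    best, best_score = current, current_score
    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end.
        frac = step / max(n_steps - 1, 1)
        temperature = t_start * (t_end / t_start) ** frac
        candidate = neighbor(current, n_vars, rng)
        candidate_score = rss(candidate, X, y)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with some probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


# Toy data (an assumption): the response depends on "X0 and not X3".
data_rng = random.Random(1)
X = [[data_rng.random() < 0.5 for _ in range(6)] for _ in range(200)]
y = [2.0 * (row[0] and not row[3]) + data_rng.gauss(0, 0.1) for row in X]

null_score = rss(frozenset(), X, y)      # intercept-only model
best_term, best_score = anneal(X, y)
print(sorted(best_term), best_score, null_score)
```

The full method described in the abstract searches over general Boolean expressions (with both AND and OR), several terms at once, and a general link function g; the sketch only shows how a Boolean term and its coefficients can be scored and searched together.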

Original language: English (US)
Pages (from-to): 475-511
Number of pages: 37
Journal: Journal of Computational and Graphical Statistics
Volume: 12
Issue number: 3
State: Published - Sep 1 2003

Keywords

  • Adaptive model selection
  • Binary variables
  • Boolean logic
  • Interactions
  • Simulated annealing
  • SNP data

ASJC Scopus subject areas

  • Statistics and Probability
  • Discrete Mathematics and Combinatorics
  • Statistics, Probability and Uncertainty
