TY - JOUR
T1 - Modelling multivariate binary data with alternating logistic regressions
AU - Carey, Vincent
AU - Zeger, Scott L.
AU - Diggle, Peter
N1 - Funding Information:
The authors thank Karen Bandeen-Roche and John Hart, Jr., for helpful discussions, and John Hart, Jr., and Barry Gordon for permission to use the Wada test data. We also thank the referees and associate editor for suggestions which have improved upon an earlier draft. Scott L. Zeger gratefully acknowledges support from the US National Institutes of Health grant AI 25529 and from the Merck, Sharp and Dohme Research Laboratory.
PY - 1993/9
Y1 - 1993/9
N2 - SUMMARY: Marginal models for multivariate binary data permit separate modelling of the relationship of the response with explanatory variables, and the association between pairs of responses. When the former is the scientific focus, a first-order generalized estimating equation method (Liang & Zeger, 1986) is easy to implement and gives efficient estimates of regression coefficients, although estimates of the association among the binary outcomes can be inefficient. When the association model is a focus, simultaneous modelling of the responses and all pairwise products (Prentice, 1988) using second-order estimating equations gives more efficient estimates of association parameters as well. However, this procedure can become computationally infeasible as the cluster size gets large. This paper proposes an alternative approach, alternating logistic regressions, for simultaneously regressing the response on explanatory variables as well as modelling the association among responses in terms of pairwise odds ratios. This algorithm iterates between a logistic regression using first-order generalized estimating equations to estimate regression coefficients and a logistic regression of each response on others from the same cluster using an appropriate offset to update the odds ratio parameters. For clusters of size n, alternating logistic regression involves evaluation and inversion of matrices of order n^2 rather than n^4 as required for second-order generalized estimating equations. The alternating logistic regression estimates are shown to be reasonably efficient relative to solutions of second-order equations in a few problems. The new method is illustrated with an analysis of neuropsychological tests on patients with epileptic seizures.
KW - Clustered data
KW - Generalized estimating equation
KW - Logistic regression
UR - http://www.scopus.com/inward/record.url?scp=0000429149&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0000429149&partnerID=8YFLogxK
U2 - 10.1093/biomet/80.3.517
DO - 10.1093/biomet/80.3.517
M3 - Article
AN - SCOPUS:0000429149
SN - 0006-3444
VL - 80
SP - 517
EP - 526
JO - Biometrika
JF - Biometrika
IS - 3
ER -