On assessing model fit for distribution-free longitudinal models under missing data

P. Wu, X. M. Tu, J. Kowalski

Research output: Contribution to journal › Article › peer-review


The generalized estimating equation (GEE), a distribution-free, or semi-parametric, approach for modeling longitudinal data, is used in a wide range of behavioral, psychotherapy, pharmaceutical drug safety, and healthcare-related research studies. Most popular methods for assessing model fit are based on the likelihood function for parametric models, rendering them inappropriate for distribution-free GEE. One rare exception is a score statistic initially proposed by Tsiatis for logistic regression (1980) and later extended by Barnhart and Williamson to GEE (1998). Because GEE provides valid inference only under the missing completely at random assumption, and missing values arising in most longitudinal studies do not follow such a restrictive mechanism, this GEE-based score test has very limited applications in practice. We propose extensions of this goodness-of-fit test to address missing data under the missing at random assumption, a more realistic model that applies to most studies in practice. We examine the performance of the proposed tests using simulated data and demonstrate the utility of these tests with data from a real study on geriatric depression and associated medical comorbidities.

Original language: English (US)
Pages (from-to): 143-157
Number of pages: 15
Journal: Statistics in Medicine
Issue number: 1
State: Published - Jan 15 2014


Keywords

  • Goodness of fit
  • Missing at random
  • Score test
  • Small-sample adjusted score test
  • Weighted generalized estimating equations

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

