On assessing model fit for distribution-free longitudinal models under missing data

P. Wu, X. M. Tu, J. Kowalski

Research output: Contribution to journal (Article)

Abstract

The generalized estimating equation (GEE), a distribution-free, or semi-parametric, approach for modeling longitudinal data, is used in a wide range of behavioral, psychotherapy, pharmaceutical drug safety, and healthcare-related research studies. Most popular methods for assessing model fit are based on the likelihood function for parametric models, rendering them inappropriate for distribution-free GEE. One rare exception is a score statistic initially proposed by Tsiatis (1980) for logistic regression and later extended to GEE by Barnhart and Williamson (1998). Because GEE provides valid inference only under the missing completely at random (MCAR) assumption, and missing values arising in most longitudinal studies do not follow such a restrictive mechanism, this GEE-based score test has very limited application in practice. We propose extensions of this goodness-of-fit test to address missing data under the missing at random (MAR) assumption, a more realistic model that applies to most studies in practice. We examine the performance of the proposed tests using simulated data and demonstrate their utility with data from a real study on geriatric depression and associated medical comorbidities.
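
The contrast between missing completely at random and missing at random can be illustrated with a small simulation. The sketch below is not the authors' test or their weighted GEE implementation; it assumes the simplest possible marginal model (an intercept-only mean at a single follow-up visit), for which an inverse-probability-weighted estimating equation reduces to a weighted mean. The function name, the two-visit design, and all numeric settings are hypothetical, and the true observation probabilities are used where a real analysis would estimate them.

```python
import numpy as np


def complete_case_vs_ipw(n=200_000, seed=0):
    """Compare a complete-case mean with an inverse-probability-weighted mean.

    Hypothetical two-visit study: y1 (baseline) is always observed, while
    y2 (follow-up) is missing at random, with missingness driven by y1 only.
    The true marginal mean of y2 is 0.
    """
    rng = np.random.default_rng(seed)
    y1 = rng.normal(0.0, 1.0, n)
    y2 = 0.8 * y1 + rng.normal(0.0, 0.6, n)

    # MAR mechanism: P(y2 observed | y1) is logistic in the observed baseline.
    p_obs = 1.0 / (1.0 + np.exp(-(0.5 + y1)))
    observed = rng.random(n) < p_obs

    # Unweighted estimating equation: sum_i r_i * (y2_i - mu) = 0.
    naive = y2[observed].mean()

    # Weighted estimating equation: sum_i (r_i / p_i) * (y2_i - mu) = 0.
    # The true p_i is used here (an assumption); in practice it is estimated.
    w = observed / p_obs
    ipw = np.sum(w * y2) / np.sum(w)
    return naive, ipw


if __name__ == "__main__":
    naive, ipw = complete_case_vs_ipw()
    print(f"complete-case mean: {naive:.3f}")
    print(f"weighted mean:      {ipw:.3f}")
```

Under these assumptions the complete-case mean should land noticeably above the true value of zero, because subjects with higher baseline values are more likely to be observed, while the weighted mean should stay close to zero. A goodness-of-fit statistic built on the unweighted equations inherits the same bias, which is the motivation for weighting given in the abstract.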

Original language: English (US)
Pages (from-to): 143-157
Number of pages: 15
Journal: Statistics in Medicine
Volume: 33
Issue number: 1
DOI: 10.1002/sim.5908
State: Published - Jan 15 2014
Externally published: Yes

Fingerprint

Generalized Estimating Equations
Distribution-free
Missing Data
Likelihood Functions
Health Services Research
Psychotherapy
Geriatrics
Pharmaceutical Preparations
Longitudinal Studies
Comorbidity
Logistic Models
Depression
Safety
Missing Completely at Random
Score Statistic
Missing at Random
Model
Longitudinal Study
Score Test
Missing Values

Keywords

  • Goodness of fit
  • Missing at random
  • Score test
  • Small-sample adjusted score test
  • Weighted generalized estimating equations

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

Cite this

On assessing model fit for distribution-free longitudinal models under missing data. / Wu, P.; Tu, X. M.; Kowalski, J.

In: Statistics in Medicine, Vol. 33, No. 1, 15.01.2014, p. 143-157.

Wu, P. ; Tu, X. M. ; Kowalski, J. / On assessing model fit for distribution-free longitudinal models under missing data. In: Statistics in Medicine. 2014 ; Vol. 33, No. 1. pp. 143-157.
@article{c3be01d1f9404118b450742744b73634,
title = "On assessing model fit for distribution-free longitudinal models under missing data",
abstract = "The generalized estimating equation (GEE), a distribution-free, or semi-parametric, approach for modeling longitudinal data, is used in a wide range of behavioral, psychotherapy, pharmaceutical drug safety, and healthcare-related research studies. Most popular methods for assessing model fit are based on the likelihood function for parametric models, rendering them inappropriate for distribution-free GEE. One rare exception is a score statistic initially proposed by Tsiatis for logistic regression (1980) and later extended by Barnhart and Williamson to GEE (1998). Because GEE only provides valid inference under the missing completely at random assumption and missing values arising in most longitudinal studies do not follow such a restricted mechanism, this GEE-based score test has very limited applications in practice. We propose extensions of this goodness-of-fit test to address missing data under the missing at random assumption, a more realistic model that applies to most studies in practice. We examine the performance of the proposed tests using simulated data and demonstrate the utilities of such tests with data from a real study on geriatric depression and associated medical comorbidities.",
keywords = "Goodness of fit, Missing at random, Score test, Small-sample adjusted score test, Weighted generalized estimating equations",
author = "P. Wu and Tu, {X. M.} and J. Kowalski",
year = "2014",
month = "1",
day = "15",
doi = "10.1002/sim.5908",
language = "English (US)",
volume = "33",
pages = "143--157",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "1",

}

TY - JOUR

T1 - On assessing model fit for distribution-free longitudinal models under missing data

AU - Wu, P.

AU - Tu, X. M.

AU - Kowalski, J.

PY - 2014/1/15

Y1 - 2014/1/15

AB - The generalized estimating equation (GEE), a distribution-free, or semi-parametric, approach for modeling longitudinal data, is used in a wide range of behavioral, psychotherapy, pharmaceutical drug safety, and healthcare-related research studies. Most popular methods for assessing model fit are based on the likelihood function for parametric models, rendering them inappropriate for distribution-free GEE. One rare exception is a score statistic initially proposed by Tsiatis for logistic regression (1980) and later extended by Barnhart and Williamson to GEE (1998). Because GEE only provides valid inference under the missing completely at random assumption and missing values arising in most longitudinal studies do not follow such a restricted mechanism, this GEE-based score test has very limited applications in practice. We propose extensions of this goodness-of-fit test to address missing data under the missing at random assumption, a more realistic model that applies to most studies in practice. We examine the performance of the proposed tests using simulated data and demonstrate the utilities of such tests with data from a real study on geriatric depression and associated medical comorbidities.

KW - Goodness of fit

KW - Missing at random

KW - Score test

KW - Small-sample adjusted score test

KW - Weighted generalized estimating equations

UR - http://www.scopus.com/inward/record.url?scp=84889634937&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84889634937&partnerID=8YFLogxK

U2 - 10.1002/sim.5908

DO - 10.1002/sim.5908

M3 - Article

VL - 33

SP - 143

EP - 157

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

IS - 1

ER -