A Bayesian shrinkage model for incomplete longitudinal binary data with application to the Breast Cancer Prevention Trial

Chenguang Wang, M. J. Daniels, Daniel O Scharfstein, S. Land

Research output: Contribution to journal › Article

Abstract

We consider inference in randomized longitudinal studies with missing data that is generated by skipped clinic visits and loss to follow-up. In this setting, it is well known that full-data estimands are not identified unless unverified assumptions are imposed. We assume a nonfuture dependence model for the drop-out mechanism and partial ignorability for the intermittent missingness. We posit an exponential tilt model that links nonidentifiable distributions and distributions identified under partial ignorability. This exponential tilt model is indexed by nonidentified parameters, which are assumed to have an informative prior distribution, elicited from subject-matter experts. Under this model, full-data estimands are shown to be expressed as functionals of the distribution of the observed data. To avoid the curse of dimensionality, we model the distribution of the observed data using a Bayesian shrinkage model. In a simulation study, we compare our approach to a fully parametric and a fully saturated model for the distribution of the observed data. Our methodology is motivated by, and applied to, data from the Breast Cancer Prevention Trial.
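The two modelling ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the single sensitivity parameter `tau`, and the Beta-prior form of the shrinkage step are all assumptions made for illustration. For a binary outcome, exponentially tilting an identified probability p by exp(tau * y) shifts its log-odds by tau, and a Beta prior centred on a parametric fit pulls a saturated cell estimate toward that fit.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def tilt_bernoulli(p_identified: float, tau: float) -> float:
    """Exponentially tilt a Bernoulli(p) distribution.

    The tilted law is p*(y) proportional to p(y) * exp(tau * y), y in {0, 1}.
    `tau` stands in for a nonidentified sensitivity parameter (hypothetical
    name); tau = 0 returns the identified probability unchanged.
    """
    weight_one = p_identified * math.exp(tau)   # unnormalised mass at y = 1
    weight_zero = 1.0 - p_identified            # unnormalised mass at y = 0
    return weight_one / (weight_one + weight_zero)


def shrink_cell(successes: int, trials: int,
                parametric_mean: float, prior_weight: float) -> float:
    """Posterior mean of a cell probability under a Beta prior.

    Assumed prior: Beta(prior_weight * m, prior_weight * (1 - m)) with
    m = parametric_mean. prior_weight = 0 gives the saturated estimate
    successes / trials; a large prior_weight approaches the parametric fit.
    """
    return (successes + prior_weight * parametric_mean) / (trials + prior_weight)


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the trial data.
    p_obs = 0.30
    for tau in (-0.5, 0.0, 0.5):
        print(f"tau={tau:+.1f}  tilted p={tilt_bernoulli(p_obs, tau):.4f}")
    print(f"shrunk estimate={shrink_cell(3, 10, 0.5, 4.0):.4f}")
```

In a sensitivity analysis of this kind, `tau` would be drawn from an elicited prior rather than fixed, and the tilted probabilities would be propagated through to the estimand of interest.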

Original language: English (US)
Pages (from-to): 1333-1346
Number of pages: 14
Journal: Journal of the American Statistical Association
Volume: 105
Issue number: 492
DOIs: 10.1198/jasa.2010.ap09321
State: Published - Dec 2010

Fingerprint

Binary Data
Longitudinal Data
Shrinkage
Breast Cancer
Ignorability
Tilt
Model
Partial
Curse of Dimensionality
Drop out
Longitudinal Study
Prior distribution
Missing Data
Simulation Study
Methodology

Keywords

  • Informative drop-out
  • Intermittent missingness
  • Prior elicitation

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

@article{b946e6c506614feea664cad914cd8ee4,
title = "A Bayesian shrinkage model for incomplete longitudinal binary data with application to the Breast Cancer Prevention Trial",
abstract = "We consider inference in randomized longitudinal studies with missing data that is generated by skipped clinic visits and loss to followup. In this setting, it is well known that full data estimands are not identified unless unverified assumptions are imposed. We assume a nonfuture dependence model for the drop-out mechanism and partial ignorability for the intermittent missingness. We posit an exponential tilt model that links nonidentifiable distributions and distributions identified under partial ignorability. This exponential tilt model is indexed by nonidentified parameters, which are assumed to have an informative prior distribution, elicited from subject-matter experts. Under this model, full data estimands are shown to be expressed as functionals of the distribution of the observed data. To avoid the curse of dimensionality, we model the distribution of the observed data using a Bayesian shrinkage model. In a simulation study, we compare our approach to a fully parametric and a fully saturated model for the distribution of the observed data. Our methodology is motivated by, and applied to, data from the Breast Cancer Prevention Trial.",
keywords = "Informative drop-out, Intermittent missingness, Prior elicitation",
author = "Chenguang Wang and Daniels, {M. J.} and Scharfstein, {Daniel O} and S. Land",
year = "2010",
month = "12",
doi = "10.1198/jasa.2010.ap09321",
language = "English (US)",
volume = "105",
pages = "1333--1346",
journal = "Journal of the American Statistical Association",
issn = "0162-1459",
publisher = "Taylor and Francis Ltd.",
number = "492",
}

TY - JOUR

T1 - A Bayesian shrinkage model for incomplete longitudinal binary data with application to the Breast Cancer Prevention Trial

AU - Wang, Chenguang

AU - Daniels, M. J.

AU - Scharfstein, Daniel O

AU - Land, S.

PY - 2010/12

Y1 - 2010/12

AB - We consider inference in randomized longitudinal studies with missing data that is generated by skipped clinic visits and loss to followup. In this setting, it is well known that full data estimands are not identified unless unverified assumptions are imposed. We assume a nonfuture dependence model for the drop-out mechanism and partial ignorability for the intermittent missingness. We posit an exponential tilt model that links nonidentifiable distributions and distributions identified under partial ignorability. This exponential tilt model is indexed by nonidentified parameters, which are assumed to have an informative prior distribution, elicited from subject-matter experts. Under this model, full data estimands are shown to be expressed as functionals of the distribution of the observed data. To avoid the curse of dimensionality, we model the distribution of the observed data using a Bayesian shrinkage model. In a simulation study, we compare our approach to a fully parametric and a fully saturated model for the distribution of the observed data. Our methodology is motivated by, and applied to, data from the Breast Cancer Prevention Trial.

KW - Informative drop-out

KW - Intermittent missingness

KW - Prior elicitation

UR - http://www.scopus.com/inward/record.url?scp=78651285375&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78651285375&partnerID=8YFLogxK

U2 - 10.1198/jasa.2010.ap09321

DO - 10.1198/jasa.2010.ap09321

M3 - Article

C2 - 21516191

AN - SCOPUS:78651285375

VL - 105

SP - 1333

EP - 1346

JO - Journal of the American Statistical Association

JF - Journal of the American Statistical Association

SN - 0162-1459

IS - 492

ER -