TY - JOUR
T1 - Deductive Semiparametric Estimation in Double-Sampling Designs with Application to PEPFAR
AU - Qian, Tianchen
AU - Frangakis, Constantine
AU - Yiannoutsos, Constantin
N1 - Funding Information:
The authors would like to thank Ming-Wen An of Vassar College for providing the code for the estimator in An et al. [2]. The authors would also like to acknowledge Beverly S. Musick of Indiana University for compiling the database on which this study was based and for offering expert advice on the data. This work was supported by the U.S. National Institute of Drug Abuse (R01 AI102710-01). The statements in this work are solely the responsibility of the authors and do not represent the views of this organization.
Publisher Copyright:
© 2019, International Chinese Statistical Association.
PY - 2020/12/1
Y1 - 2020/12/1
N2 - Non-ignorable dropout is common in studies with long follow-up time, and it can bias study results unless handled carefully in the study design and the statistical analysis. A double-sampling design allocates additional resources to pursue a subsample of the dropouts and find out their outcomes, which can address potential biases due to non-ignorable dropout. It is desirable to construct semiparametric estimators for the double-sampling design because of their robustness properties. However, obtaining such semiparametric estimators remains a challenge due to the requirement of the analytic form of the efficient influence function (EIF), the derivation of which can be ad hoc and difficult for the double-sampling design. Recent work has shown how the derivation of EIF can be made deductive and computerizable using the functional derivative representation of the EIF in nonparametric models. This approach, however, requires deriving the mixture of a continuous distribution and a point mass, which can itself be challenging for complicated problems such as the double-sampling design. We propose semiparametric estimators for the survival probability in double-sampling designs by generalizing the deductive and computerizable estimation approach. In particular, we propose to build the semiparametric estimators based on a discretized support structure, which approximates the possibly continuous observed data distribution and circumvents the derivation of the mixture distribution. Our approach is deductive in the sense that it is expected to produce semiparametric locally efficient estimators within finite steps without knowledge of the EIF. We apply the proposed estimators to estimating the mortality rate in a double-sampling design component of the President’s Emergency Plan for AIDS Relief (PEPFAR) program. We evaluate the impact of double-sampling selection criteria on the mortality rate estimates. Simulation studies are conducted to evaluate the robustness of the proposed estimators.
KW - Deductive estimator
KW - Double-sampling design
KW - Missing data
KW - Semiparametric estimator
KW - Survival analysis
KW - Turing-computerization
UR - http://www.scopus.com/inward/record.url?scp=85074729537&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074729537&partnerID=8YFLogxK
U2 - 10.1007/s12561-019-09262-2
DO - 10.1007/s12561-019-09262-2
M3 - Article
AN - SCOPUS:85074729537
SN - 1867-1764
VL - 12
SP - 417
EP - 445
JO - Statistics in Biosciences
JF - Statistics in Biosciences
IS - 3
ER -