TY - JOUR
T1 - Estimation of interaction effects using pooled biospecimens in a case-control study
AU - Danaher, Michelle R.
AU - Albert, Paul S.
AU - Roy, Anindya
AU - Schisterman, Enrique F.
PY - 2015
Y1 - 2015
N2 - Pooling, or physically mixing biospecimens, prior to evaluating biomarkers dramatically reduces biomarker evaluation cost, reduces the quantity of biospecimens required of each individual, and may reduce the percentage of laboratory measurements below the lower limit of detection. Motivated by a case-control study on miscarriage (binary outcome) and cytokines (continuous exposures), we are interested in estimating parameters in a logistic regression, where individuals with the same disease status (with or without a miscarriage) are paired and their pooled cytokine concentrations are assessed. Previous research has proposed a set-based logistic model to evaluate the relationship between a disease and pooled exposures. While the set-based logistic model is very useful for estimating main effects, it cannot estimate interactions of continuous exposures when both are measured in pools. Therefore, we propose using the expectation maximization (EM) algorithm to obtain estimators of all parameters in the logistic regression model, including interaction effects. Using a simulation study, we present comparisons of efficiency under different scenarios where exposures have been measured in pools and individually. Our simulations show that randomly sampling half of the available biospecimens is less efficient than pooling pairs of biospecimens stratified by disease status. The EM algorithm provides a method for estimating interaction effects when biospecimens have already been pooled for other reasons, such as the gain in efficiency for estimating main effects demonstrated by previous research. This manuscript demonstrates that the EM algorithm offers a promising approach to estimate interaction effects of pooled biospecimens.
AB - Pooling, or physically mixing biospecimens, prior to evaluating biomarkers dramatically reduces biomarker evaluation cost, reduces the quantity of biospecimens required of each individual, and may reduce the percentage of laboratory measurements below the lower limit of detection. Motivated by a case-control study on miscarriage (binary outcome) and cytokines (continuous exposures), we are interested in estimating parameters in a logistic regression, where individuals with the same disease status (with or without a miscarriage) are paired and their pooled cytokine concentrations are assessed. Previous research has proposed a set-based logistic model to evaluate the relationship between a disease and pooled exposures. While the set-based logistic model is very useful for estimating main effects, it cannot estimate interactions of continuous exposures when both are measured in pools. Therefore, we propose using the expectation maximization (EM) algorithm to obtain estimators of all parameters in the logistic regression model, including interaction effects. Using a simulation study, we present comparisons of efficiency under different scenarios where exposures have been measured in pools and individually. Our simulations show that randomly sampling half of the available biospecimens is less efficient than pooling pairs of biospecimens stratified by disease status. The EM algorithm provides a method for estimating interaction effects when biospecimens have already been pooled for other reasons, such as the gain in efficiency for estimating main effects demonstrated by previous research. This manuscript demonstrates that the EM algorithm offers a promising approach to estimate interaction effects of pooled biospecimens.
KW - Cytokines
KW - Expectation maximization
KW - Logistic regression
KW - Pooling designs
KW - Skewed biomarkers
UR - http://www.scopus.com/inward/record.url?scp=84947997986&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84947997986&partnerID=8YFLogxK
U2 - 10.1002/sim.6798
DO - 10.1002/sim.6798
M3 - Article
C2 - 26553532
AN - SCOPUS:84947997986
JO - Statistics in Medicine
JF - Statistics in Medicine
SN - 0277-6715
ER -