### Abstract

Pooling, or physically mixing biospecimens, prior to evaluating biomarkers dramatically reduces biomarker evaluation cost, reduces the quantity of biospecimens required of each individual, and may reduce the percentage of laboratory measurements below the lower limit of detection. Motivated by a case-control study on miscarriage (binary outcome) and cytokines (continuous exposures), we are interested in estimating parameters in a logistic regression, where individuals with the same disease status (with or without a miscarriage) are paired and their pooled cytokine concentrations are assessed. Previous research has proposed a set-based logistic model to evaluate the relationship between a disease and pooled exposures. While the set-based logistic model is very useful for estimating main effects, it cannot estimate interactions between continuous exposures when both are measured in pools. Therefore, we propose using the expectation maximization (EM) algorithm to obtain estimators of all parameters in the logistic regression model, including interaction effects. Using a simulation study, we compare efficiency under different scenarios in which exposures have been measured in pools and individually. Our simulations show that randomly sampling half of the available biospecimens is less efficient than pooling pairs of biospecimens stratified by disease status. The EM algorithm provides a method for estimating interaction effects when biospecimens have already been pooled for other reasons, such as the gain in efficiency for estimating main effects demonstrated by previous research. This manuscript demonstrates that the EM algorithm offers a promising approach to estimating interaction effects from pooled biospecimens.
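The pooling design described above can be illustrated with a minimal sketch. It is not the authors' EM algorithm. It only shows, under the assumption that a pooled measurement equals the arithmetic mean of the two individual concentrations (equal-volume mixing, no measurement error), why an interaction term is not directly recoverable from pooled values: the product of two pooled exposures differs from the pooled product of the individual exposures. The function name `pool_pairs` and the toy data are hypothetical.

```python
import numpy as np


def pool_pairs(x, y):
    """Average exposures within consecutive pairs that share disease status.

    x: (n, k) array of individual exposures; y: (n,) array of 0/1 status.
    Returns the pooled exposures and the status shared by each pool.
    Assumption: a pool's value is the mean of its two members.
    A leftover unpaired individual in either group is dropped.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    k = x.shape[1]
    pooled, status = [], []
    for label in (0, 1):  # controls first, then cases
        group = x[y == label]
        n_pairs = len(group) // 2
        pairs = group[: 2 * n_pairs].reshape(n_pairs, 2, k)
        pooled.append(pairs.mean(axis=1))
        status.append(np.full(n_pairs, label))
    return np.vstack(pooled), np.concatenate(status)


# Toy data: two exposures for two controls (y = 0) and two cases (y = 1).
x = np.array([[1.0, 2.0], [3.0, 4.0], [2.0, 2.0], [6.0, 1.0]])
y = np.array([0, 0, 1, 1])

x_pooled, y_pooled = pool_pairs(x, y)

# What the laboratory observes: one pooled value per exposure per pair.
product_of_pools = x_pooled[:, 0] * x_pooled[:, 1]        # [6. 6.]

# What an interaction term needs: the pooled individual-level products.
individual_products = (x[:, 0] * x[:, 1])[:, None]
mean_of_products = pool_pairs(individual_products, y)[0][:, 0]  # [7. 5.]

print(product_of_pools, mean_of_products)
```

For a pair with exposures (a1, b1) and (a2, b2), the gap between the two quantities is (a1 - a2)(b1 - b2) / 4, so it vanishes only when the pair members agree on at least one exposure. Treating the unobserved individual values as missing data is what motivates an EM-type approach.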

Original language | English (US) |
---|---|
Journal | Statistics in Medicine |
DOIs | https://doi.org/10.1002/sim.6798 |
State | Accepted/In press - 2015 |
Externally published | Yes |

### Keywords

- Cytokines
- Expectation maximization
- Logistic regression
- Pooling designs
- Skewed biomarkers

### ASJC Scopus subject areas

- Epidemiology
- Statistics and Probability

### Cite this

*Statistics in Medicine*. https://doi.org/10.1002/sim.6798

**Estimation of interaction effects using pooled biospecimens in a case-control study.** / Danaher, Michelle R.; Albert, Paul S.; Roy, Aninyda; Schisterman, Enrique F.

Research output: Contribution to journal › Article

TY - JOUR

T1 - Estimation of interaction effects using pooled biospecimens in a case-control study

AU - Danaher, Michelle R.

AU - Albert, Paul S.

AU - Roy, Aninyda

AU - Schisterman, Enrique F.

PY - 2015

Y1 - 2015

KW - Cytokines

KW - Expectation maximization

KW - Logistic regression

KW - Pooling designs

KW - Skewed biomarkers

UR - http://www.scopus.com/inward/record.url?scp=84947997986&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84947997986&partnerID=8YFLogxK

U2 - 10.1002/sim.6798

DO - 10.1002/sim.6798

M3 - Article

C2 - 26553532

AN - SCOPUS:84947997986

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

ER -