Confidence intervals for the population mean tailored to small sample sizes, with applications to survey sampling

Michael A. Rosenblum; Mark J. Van Der Laan

doi:10.2202/1557-4679.1118

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes. We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability under much weaker assumptions than are required for standard methods. A drawback of this approach, as we show, is that these confidence intervals are often quite wide. In response to this, we present a method for constructing much narrower confidence intervals, which are better suited for practical applications, and that are still more robust than confidence intervals based on standard methods, when dealing with small sample sizes. We show how to extend our approaches to much more general estimation problems than estimating the sample mean. We describe how these methods can be used to obtain more reliable confidence intervals in survey sampling. As a concrete example, we construct confidence intervals using our methods for the number of violent deaths between March 2003 and July 2006 in Iraq, based on data from the study "Mortality after the 2003 invasion of Iraq: A cross sectional cluster sample survey," by Burnham et al. (2006).
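As an illustration of the tail-bound approach mentioned in the abstract, the sketch below inverts a standard two-sided form of Bernstein's inequality, P(|mean - mu| >= t) <= 2 exp(-n t^2 / (2 s2 + 2 M t / 3)), to obtain an interval half-width t for a chosen level alpha. It is not taken from the article and may differ from the authors' construction. It assumes independent observations known to lie in a range [lower, upper], uses M = upper - lower as a bound on |X - mu|, and, unless a tighter variance bound is supplied, the worst-case variance bound (upper - lower)^2 / 4. The function names are hypothetical.

```python
import math


def bernstein_half_width(n, var_bound, dev_bound, alpha=0.05):
    """Smallest t with 2*exp(-n*t^2 / (2*var_bound + 2*dev_bound*t/3)) == alpha.

    Assumptions (not from the article): observations are independent,
    each has variance at most var_bound, and |X - mu| <= dev_bound.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    log_term = math.log(2.0 / alpha)
    # Positive root of n*t^2 - (2*M*L/3)*t - 2*s2*L = 0, with L = log(2/alpha).
    linear = dev_bound * log_term / (3.0 * n)
    return linear + math.sqrt(linear ** 2 + 2.0 * var_bound * log_term / n)


def bernstein_interval(sample, lower, upper, alpha=0.05, var_bound=None):
    """Confidence interval for the mean of observations bounded in [lower, upper]."""
    n = len(sample)
    mean = sum(sample) / n
    width = upper - lower
    if var_bound is None:
        # Worst-case variance of any distribution supported on [lower, upper].
        var_bound = width ** 2 / 4.0
    t = bernstein_half_width(n, var_bound, width, alpha)
    # The true mean lies in [lower, upper], so clipping keeps the coverage guarantee.
    return max(lower, mean - t), min(upper, mean + t)


if __name__ == "__main__":
    # Hypothetical sample of 20 proportions in [0, 1].
    data = [0.2, 0.4, 0.6, 0.8] * 5
    print(bernstein_interval(data, 0.0, 1.0, alpha=0.05))
```

With only 20 observations and the worst-case variance bound, the resulting interval covers most of the admissible range, which is consistent with the abstract's remark that intervals built this way are often quite wide.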

Original language	English (US)
Article number	4
Journal	International Journal of Biostatistics
Volume	5
Issue number	1
DOIs	https://doi.org/10.2202/1557-4679.1118
State	Published - 2009
Externally published	Yes

Keywords

Bernstein's inequality
Central limit theorem
Confidence interval
Influence curve
Normal distribution
Survey sampling

ASJC Scopus subject areas

Statistics and Probability
Statistics, Probability and Uncertainty

Cite this

@article{1f3df677523047b3a36ee592db88dcf8,
title = "Confidence intervals for the population mean tailored to small sample sizes, with applications to survey sampling",
abstract = "The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes. We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability under much weaker assumptions than are required for standard methods. A drawback of this approach, as we show, is that these confidence intervals are often quite wide. In response to this, we present a method for constructing much narrower confidence intervals, which are better suited for practical applications, and that are still more robust than confidence intervals based on standard methods, when dealing with small sample sizes. We show how to extend our approaches to much more general estimation problems than estimating the sample mean. We describe how these methods can be used to obtain more reliable confidence intervals in survey sampling. As a concrete example, we construct confidence intervals using our methods for the number of violent deaths between March 2003 and July 2006 in Iraq, based on data from the study {"}Mortality after the 2003 invasion of Iraq: A cross sectional cluster sample survey,{"} by Burnham et al. (2006).",
keywords = "Bernstein's inequality, Central limit theorem, Confidence interval, Influence curve, Normal distribution, Survey sampling",
author = "Rosenblum, {Michael A.} and {Van Der Laan}, {Mark J.}",
note = "Funding Information: Michael Rosenblum was supported by a Ruth L. Kirschstein National Research Service Award (NRSA) under NIH/NIMH grant 5 T32 MH-19105-19. Mark van der Laan was supported by NIH grant R01 A1074345-01.",
year = "2009",
doi = "10.2202/1557-4679.1118",
language = "English (US)",
volume = "5",
journal = "International Journal of Biostatistics",
issn = "1557-4679",
publisher = "Berkeley Electronic Press",
number = "1",
}

TY - JOUR
T1 - Confidence intervals for the population mean tailored to small sample sizes, with applications to survey sampling
AU - Rosenblum, Michael A.
AU - Van Der Laan, Mark J.
N1 - Funding Information: Michael Rosenblum was supported by a Ruth L. Kirschstein National Research Service Award (NRSA) under NIH/NIMH grant 5 T32 MH-19105-19. Mark van der Laan was supported by NIH grant R01 A1074345-01.
PY - 2009
Y1 - 2009
AB - The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes. We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability under much weaker assumptions than are required for standard methods. A drawback of this approach, as we show, is that these confidence intervals are often quite wide. In response to this, we present a method for constructing much narrower confidence intervals, which are better suited for practical applications, and that are still more robust than confidence intervals based on standard methods, when dealing with small sample sizes. We show how to extend our approaches to much more general estimation problems than estimating the sample mean. We describe how these methods can be used to obtain more reliable confidence intervals in survey sampling. As a concrete example, we construct confidence intervals using our methods for the number of violent deaths between March 2003 and July 2006 in Iraq, based on data from the study "Mortality after the 2003 invasion of Iraq: A cross sectional cluster sample survey," by Burnham et al. (2006).
KW - Bernstein's inequality
KW - Central limit theorem
KW - Confidence interval
KW - Influence curve
KW - Normal distribution
KW - Survey sampling
UR - http://www.scopus.com/inward/record.url?scp=62749164866&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=62749164866&partnerID=8YFLogxK
U2 - 10.2202/1557-4679.1118
DO - 10.2202/1557-4679.1118
M3 - Article
C2 - 20231867
AN - SCOPUS:62749164866
SN - 1557-4679
VL - 5
JO - International Journal of Biostatistics
JF - International Journal of Biostatistics
IS - 1
M1 - 4
ER -
