Confidence intervals for the population mean tailored to small sample sizes, with applications to survey sampling

Michael Aaron Rosenblum, Mark J. Van Der Laan

Research output: Contribution to journal › Article

Abstract

The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes. We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability under much weaker assumptions than are required for standard methods. A drawback of this approach, as we show, is that these confidence intervals are often quite wide. In response to this, we present a method for constructing much narrower confidence intervals, which are better suited for practical applications, and that are still more robust than confidence intervals based on standard methods, when dealing with small sample sizes. We show how to extend our approaches to much more general estimation problems than estimating the sample mean. We describe how these methods can be used to obtain more reliable confidence intervals in survey sampling. As a concrete example, we construct confidence intervals using our methods for the number of violent deaths between March 2003 and July 2006 in Iraq, based on data from the study "Mortality after the 2003 invasion of Iraq: A cross sectional cluster sample survey," by Burnham et al. (2006).
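To make the tail-bound construction mentioned in the abstract concrete, here is a minimal sketch of a confidence interval obtained by inverting the textbook two-sided Bernstein inequality. It is not the authors' procedure, whose details are not given on this page. It assumes independent observations with |X_i - mu| <= m and Var(X_i) <= sigma2, where both bounds are known in advance. The function names and parameters (`bernstein_half_width`, `bernstein_interval`, `normal_half_width`, `sigma2`, `m`) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def bernstein_half_width(n, sigma2, m, alpha=0.05):
    """Half-width t solving 2*exp(-n*t^2 / (2*sigma2 + 2*m*t/3)) = alpha.

    Assumes (illustrative, not from the paper): independent observations,
    |X_i - mu| <= m almost surely, and Var(X_i) <= sigma2.
    """
    if n <= 0 or sigma2 < 0 or m <= 0 or not 0 < alpha < 1:
        raise ValueError("need n > 0, sigma2 >= 0, m > 0, 0 < alpha < 1")
    log_term = math.log(2.0 / alpha)
    a = m * log_term / (3.0 * n)
    # Positive root of n*t^2 - (2*m*log_term/3)*t - 2*sigma2*log_term = 0.
    return a + math.sqrt(a * a + 2.0 * sigma2 * log_term / n)


def bernstein_interval(sample, sigma2, m, alpha=0.05):
    """Interval (sample mean - t, sample mean + t) under the assumptions above."""
    n = len(sample)
    mean = sum(sample) / n
    t = bernstein_half_width(n, sigma2, m, alpha)
    return mean - t, mean + t


def normal_half_width(n, sigma2, alpha=0.05):
    """Half-width of the usual central-limit-theorem interval, for comparison."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * math.sqrt(sigma2 / n)


if __name__ == "__main__":
    # Observations in [0, 1] satisfy |X - mu| <= 1 and Var(X) <= 0.25,
    # so sigma2 = 0.25 and m = 1 are valid bounds without extra knowledge.
    n = 20
    print("Bernstein half-width:", bernstein_half_width(n, 0.25, 1.0))
    print("Normal half-width:   ", normal_half_width(n, 0.25))
```

For 20 observations in [0, 1], the Bernstein half-width comes out noticeably larger than the normal-approximation half-width. This matches the abstract's remark that intervals with guaranteed coverage are often quite wide.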

Original language: English (US)
Article number: 4
Journal: The International Journal of Biostatistics
Volume: 5
Issue number: 1
DOI: 10.2202/1557-4679.1118
State: Published - 2009
Externally published: Yes

Fingerprint

Survey Sampling
Small Sample Size
Sample Size
Confidence interval
Confidence Intervals
Population
Iraq
Sample mean
Central limit theorem
Tail
Surveys and Questionnaires
Sampling
Small sample
Sample size
Bernstein Inequality
Sample Survey
Invasion
Coverage Probability
Mortality

Keywords

  • Bernstein's inequality
  • Central limit theorem
  • Confidence interval
  • Influence curve
  • Normal distribution
  • Survey sampling

ASJC Scopus subject areas

  • Medicine(all)
  • Statistics, Probability and Uncertainty
  • Statistics and Probability

Cite this

@article{1f3df677523047b3a36ee592db88dcf8,
title = "Confidence intervals for the population mean tailored to small sample sizes, with applications to survey sampling",
abstract = "The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes. We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability under much weaker assumptions than are required for standard methods. A drawback of this approach, as we show, is that these confidence intervals are often quite wide. In response to this, we present a method for constructing much narrower confidence intervals, which are better suited for practical applications, and that are still more robust than confidence intervals based on standard methods, when dealing with small sample sizes. We show how to extend our approaches to much more general estimation problems than estimating the sample mean. We describe how these methods can be used to obtain more reliable confidence intervals in survey sampling. As a concrete example, we construct confidence intervals using our methods for the number of violent deaths between March 2003 and July 2006 in Iraq, based on data from the study {"}Mortality after the 2003 invasion of Iraq: A cross sectional cluster sample survey,{"} by Burnham et al. (2006).",
keywords = "Bernstein's inequality, Central limit theorem, Confidence interval, Influence curve, Normal distribution, Survey sampling",
author = "Rosenblum, {Michael Aaron} and {Van Der Laan}, {Mark J.}",
year = "2009",
doi = "10.2202/1557-4679.1118",
language = "English (US)",
volume = "5",
journal = "International Journal of Biostatistics",
issn = "1557-4679",
publisher = "Berkeley Electronic Press",
number = "1",

}

TY - JOUR

T1 - Confidence intervals for the population mean tailored to small sample sizes, with applications to survey sampling

AU - Rosenblum, Michael Aaron

AU - Van Der Laan, Mark J.

PY - 2009

Y1 - 2009

AB - The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes. We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability under much weaker assumptions than are required for standard methods. A drawback of this approach, as we show, is that these confidence intervals are often quite wide. In response to this, we present a method for constructing much narrower confidence intervals, which are better suited for practical applications, and that are still more robust than confidence intervals based on standard methods, when dealing with small sample sizes. We show how to extend our approaches to much more general estimation problems than estimating the sample mean. We describe how these methods can be used to obtain more reliable confidence intervals in survey sampling. As a concrete example, we construct confidence intervals using our methods for the number of violent deaths between March 2003 and July 2006 in Iraq, based on data from the study "Mortality after the 2003 invasion of Iraq: A cross sectional cluster sample survey," by Burnham et al. (2006).

KW - Bernstein's inequality

KW - Central limit theorem

KW - Confidence interval

KW - Influence curve

KW - Normal distribution

KW - Survey sampling

UR - http://www.scopus.com/inward/record.url?scp=62749164866&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=62749164866&partnerID=8YFLogxK

U2 - 10.2202/1557-4679.1118

DO - 10.2202/1557-4679.1118

M3 - Article

C2 - 20231867

AN - SCOPUS:62749164866

VL - 5

JO - International Journal of Biostatistics

JF - International Journal of Biostatistics

SN - 1557-4679

IS - 1

M1 - 4

ER -