Statistical tests, P values, confidence intervals, and power

a guide to misinterpretations

Sander Greenland, Stephen J. Senn, Kenneth J. Rothman, John B. Carlin, Charles Poole, Steven N. Goodman, Douglas G. Altman

Research output: Contribution to journal (Article)

Abstract

Misinterpretation and abuse of statistical tests, confidence intervals, and statistical power have been decried for decades, yet remain rampant. A key problem is that there are no interpretations of these concepts that are at once simple, intuitive, correct, and foolproof. Instead, correct use and interpretation of these statistics requires an attention to detail which seems to tax the patience of working scientists. This high cognitive demand has led to an epidemic of shortcut definitions and interpretations that are simply wrong, sometimes disastrously so—and yet these misinterpretations dominate much of the scientific literature. In light of this problem, we provide definitions and a discussion of basic statistics that are more general and critical than typically found in traditional introductory expositions. Our goal is to provide a resource for instructors, researchers, and consumers of statistics whose knowledge of statistical theory and technique may be limited but who wish to avoid and spot misinterpretations. We emphasize how violation of often unstated analysis protocols (such as selecting analyses for presentation based on the P values they produce) can lead to small P values even if the declared test hypothesis is correct, and can lead to large P values even if that hypothesis is incorrect. We then provide an explanatory list of 25 misinterpretations of P values, confidence intervals, and power. We conclude with guidelines for improving statistical interpretation and reporting.
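The abstract's point about selecting analyses by their P values can be illustrated with a small simulation. This is a minimal sketch and not taken from the article: it assumes a two-sided z-test with known standard deviation, a test hypothesis (mean zero) that is exactly true, and analyses that are independent of one another, which real analyses of a single data set usually are not. The function names `two_sided_p` and `fraction_below_alpha` are hypothetical.

```python
import math
import random


def two_sided_p(sample, sigma=1.0):
    """Two-sided z-test P value for the hypothesis 'mean = 0' with known sigma."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def fraction_below_alpha(n_studies, n_analyses, n_obs=30, alpha=0.05, seed=0):
    """Fraction of simulated studies whose *smallest* P value is below alpha.

    Every analysis draws data for which the test hypothesis is true, so any
    excess over alpha comes only from reporting the smallest P value.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = [
            two_sided_p([rng.gauss(0.0, 1.0) for _ in range(n_obs)])
            for _ in range(n_analyses)
        ]
        if min(p_values) < alpha:
            hits += 1
    return hits / n_studies


# One prespecified analysis per study versus the "best" of ten analyses.
honest = fraction_below_alpha(n_studies=2000, n_analyses=1)
selected = fraction_below_alpha(n_studies=2000, n_analyses=10)
print(f"single prespecified analysis: {honest:.3f}")
print(f"smallest of ten analyses:     {selected:.3f}")
```

Under these assumptions the first fraction should land near 0.05, while the second should land near 1 - 0.95^10, roughly 0.40, even though the test hypothesis is correct in every simulated analysis.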

Original language: English (US)
Pages (from-to): 337-350
Number of pages: 14
Journal: European Journal of Epidemiology
Volume: 31
Issue number: 4
DOI: https://doi.org/10.1007/s10654-016-0149-3
State: Published - Apr 1 2016
Externally published: Yes

Keywords

  • Confidence intervals
  • Hypothesis testing
  • Null testing
  • P value
  • Power
  • Significance tests
  • Statistical testing

ASJC Scopus subject areas

  • Epidemiology

Cite this

Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. European Journal of Epidemiology, 31(4), 337-350. https://doi.org/10.1007/s10654-016-0149-3

@article{a47b2c795d464b94a9755e0dc38964c0,
title = "Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations",
keywords = "Confidence intervals, Hypothesis testing, Null testing, P value, Power, Significance tests, Statistical testing",
author = "Sander Greenland and Senn, {Stephen J.} and Rothman, {Kenneth J.} and Carlin, {John B.} and Charles Poole and Goodman, {Steven N.} and Altman, {Douglas G.}",
year = "2016",
month = "4",
day = "1",
doi = "10.1007/s10654-016-0149-3",
language = "English (US)",
volume = "31",
pages = "337--350",
journal = "European Journal of Epidemiology",
issn = "0393-2990",
publisher = "Springer Netherlands",
number = "4",

}
