Toward evidence-based medical statistics. 1: The P value fallacy

Steven N. Goodman

School of Medicine

Research output: Contribution to journal › Article › peer-review

795 Scopus citations

Abstract

An important problem exists in the interpretation of modern medical research data: Biological understanding and previous research play little formal role in the interpretation of quantitative results. This phenomenon is manifest in the discussion sections of research articles and ultimately can affect the reliability of conclusions. The standard statistical approach has created this situation by promoting the illusion that conclusions can be produced with certain 'error rates,' without consideration of information from outside the experiment. This statistical approach, the key components of which are P values and hypothesis tests, is widely perceived as a mathematically coherent approach to inference. There is little appreciation in the medical community that the methodology is an amalgam of incompatible elements, whose utility for scientific inference has been the subject of intense debate among statisticians for almost 70 years. This article introduces some of the key elements of that debate and traces the appeal and adverse impact of this methodology to the P value fallacy, the mistaken idea that a single number can capture both the long-run outcomes of an experiment and the evidential meaning of a single result. This argument is made as a prelude to the suggestion that another measure of evidence should be used - the Bayes factor, which properly separates issues of long-run behavior from evidential strength and allows the integration of background knowledge with statistical findings.

Original language	English (US)
Pages (from-to)	995-1004
Number of pages	10
Journal	Annals of Internal Medicine
Volume	130
Issue number	12
State	Published - Jun 15 1999

ASJC Scopus subject areas

General Medicine

Cite this

@article{a9fe1f19facd417bb44e2a1c33aa19de,
  title = "Toward evidence-based medical statistics. 1: The P value fallacy",
  abstract = "An important problem exists in the interpretation of modern medical research data: Biological understanding and previous research play little formal role in the interpretation of quantitative results. This phenomenon is manifest in the discussion sections of research articles and ultimately can affect the reliability of conclusions. The standard statistical approach has created this situation by promoting the illusion that conclusions can be produced with certain 'error rates,' without consideration of information from outside the experiment. This statistical approach, the key components of which are P values and hypothesis tests, is widely perceived as a mathematically coherent approach to inference. There is little appreciation in the medical community that the methodology is an amalgam of incompatible elements, whose utility for scientific inference has been the subject of intense debate among statisticians for almost 70 years. This article introduces some of the key elements of that debate and traces the appeal and adverse impact of this methodology to the P value fallacy, the mistaken idea that a single number can capture both the long-run outcomes of an experiment and the evidential meaning of a single result. This argument is made as a prelude to the suggestion that another measure of evidence should be used - the Bayes factor, which properly separates issues of long-run behavior from evidential strength and allows the integration of background knowledge with statistical findings.",
  author = "Goodman, {Steven N.}",
  year = "1999",
  month = jun,
  day = "15",
  language = "English (US)",
  volume = "130",
  pages = "995--1004",
  journal = "Annals of Internal Medicine",
  issn = "0003-4819",
  publisher = "American College of Physicians",
  number = "12",
}

TY - JOUR
T1 - Toward evidence-based medical statistics. 1
T2 - The P value fallacy
AU - Goodman, Steven N.
PY - 1999/6/15
Y1 - 1999/6/15
N2 - An important problem exists in the interpretation of modern medical research data: Biological understanding and previous research play little formal role in the interpretation of quantitative results. This phenomenon is manifest in the discussion sections of research articles and ultimately can affect the reliability of conclusions. The standard statistical approach has created this situation by promoting the illusion that conclusions can be produced with certain 'error rates,' without consideration of information from outside the experiment. This statistical approach, the key components of which are P values and hypothesis tests, is widely perceived as a mathematically coherent approach to inference. There is little appreciation in the medical community that the methodology is an amalgam of incompatible elements, whose utility for scientific inference has been the subject of intense debate among statisticians for almost 70 years. This article introduces some of the key elements of that debate and traces the appeal and adverse impact of this methodology to the P value fallacy, the mistaken idea that a single number can capture both the long-run outcomes of an experiment and the evidential meaning of a single result. This argument is made as a prelude to the suggestion that another measure of evidence should be used - the Bayes factor, which properly separates issues of long-run behavior from evidential strength and allows the integration of background knowledge with statistical findings.

UR - http://www.scopus.com/inward/record.url?scp=0033564491&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0033564491&partnerID=8YFLogxK
M3 - Article
C2 - 10383371
AN - SCOPUS:0033564491
SN - 0003-4819
VL - 130
SP - 995
EP - 1004
JO - Annals of Internal Medicine
JF - Annals of Internal Medicine
IS - 12
ER -
