p values, hypothesis tests, and likelihood

Implications for epidemiology of a neglected historical debate

Steven N. Goodman

Research output: Contribution to journal, Article

Abstract

It is not generally appreciated that the p value, as conceived by R. A. Fisher, is not compatible with the Neyman-Pearson hypothesis test in which it has become embedded. The p value was meant to be a flexible inferential measure, whereas the hypothesis test was a rule for behavior, not inference. The combination of the two methods has led to a reinterpretation of the p value simultaneously as an "observed error rate" and as a measure of evidence. Both of these interpretations are problematic, and their combination has obscured the important differences between Neyman and Fisher on the nature of the scientific method and inhibited our understanding of the philosophic implications of the basic methods in use today. An analysis using another method promoted by Fisher, mathematical likelihood, shows that the p value substantially overstates the evidence against the null hypothesis. Likelihood makes clearer the distinction between error rates and inferential evidence and is a quantitative tool for expressing evidential strength that is more appropriate for the purposes of epidemiology than the p value.
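The claim that the p value overstates the evidence against the null hypothesis can be illustrated with a small calculation. The sketch below is not taken from the article: it assumes a normally distributed test statistic and a two-sided p value, and it compares the null hypothesis with the single alternative best supported by the data (the observed effect itself). Under those assumptions the likelihood ratio of null to alternative is exp(-z^2/2), the smallest value the ratio can take against any simple alternative. The function names are hypothetical, chosen for this illustration.

```python
import math
from statistics import NormalDist


def z_from_two_sided_p(p):
    """Return the standard normal z score matching a two-sided p value."""
    return NormalDist().inv_cdf(1 - p / 2)


def min_likelihood_ratio(p):
    """Likelihood of the null relative to the best-supported alternative.

    Assumes a normal test statistic; the best-supported alternative is the
    observed effect, which gives the ratio exp(-z^2 / 2).
    """
    z = z_from_two_sided_p(p)
    return math.exp(-z * z / 2)


if __name__ == "__main__":
    for p in (0.10, 0.05, 0.01, 0.001):
        z = z_from_two_sided_p(p)
        print(f"p = {p:<6} z = {z:.2f}  minimum likelihood ratio = {min_likelihood_ratio(p):.4f}")
```

Under these assumptions, p = 0.05 corresponds to a minimum likelihood ratio of roughly 0.15: even against the alternative most favored by the data, the null is only about one seventh as well supported, which is weaker evidence than the figure 0.05 tends to suggest.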

Original language: English (US)
Pages (from-to): 485-496
Number of pages: 12
Journal: American Journal of Epidemiology
Volume: 137
Issue number: 5
State: Published - Mar 1 1993

Keywords

  • Hypothesis tests
  • Inference
  • Likelihood
  • P values
  • Significance tests

ASJC Scopus subject areas

  • Geriatrics and Gerontology
  • Epidemiology

Cite this

p values, hypothesis tests, and likelihood: Implications for epidemiology of a neglected historical debate. / Goodman, Steven N.

In: American Journal of Epidemiology, Vol. 137, No. 5, 01.03.1993, p. 485-496.

@article{3bbeea40355e4c69a0cea2f4ddc211bb,
title = "p values, hypothesis tests, and likelihood: Implications for epidemiology of a neglected historical debate",
abstract = "It is not generally appreciated that the p value, as conceived by R. A. Fisher, is not compatible with the Neyman-Pearson hypothesis test in which it has become embedded. The p value was meant to be a flexible inferential measure, whereas the hypothesis test was a rule for behavior, not inference. The combination of the two methods has led to a reinterpretation of the p value simultaneously as an {"}observed error rate{"} and as a measure of evidence. Both of these interpretations are problematic, and their combination has obscured the important differences between Neyman and Fisher on the nature of the scientific method and inhibited our understanding of the philosophic implications of the basic methods in use today. An analysis using another method promoted by Fisher, mathematical likelihood, shows that the p value substantially overstates the evidence against the null hypothesis. Likelihood makes clearer the distinction between error rates and inferential evidence and is a quantitative tool for expressing evidential strength that is more appropriate for the purposes of epidemiology than the p value.",
keywords = "Hypothesis tests, Inference, Likelihood, P values, Significance tests",
author = "Goodman, {Steven N.}",
year = "1993",
month = "3",
day = "1",
language = "English (US)",
volume = "137",
pages = "485--496",
journal = "American Journal of Epidemiology",
issn = "0002-9262",
publisher = "Oxford University Press",
number = "5",

}

TY - JOUR
T1 - p values, hypothesis tests, and likelihood
T2 - Implications for epidemiology of a neglected historical debate
AU - Goodman, Steven N.
PY - 1993/3/1
Y1 - 1993/3/1

N2 - It is not generally appreciated that the p value, as conceived by R. A. Fisher, is not compatible with the Neyman-Pearson hypothesis test in which it has become embedded. The p value was meant to be a flexible inferential measure, whereas the hypothesis test was a rule for behavior, not inference. The combination of the two methods has led to a reinterpretation of the p value simultaneously as an "observed error rate" and as a measure of evidence. Both of these interpretations are problematic, and their combination has obscured the important differences between Neyman and Fisher on the nature of the scientific method and inhibited our understanding of the philosophic implications of the basic methods in use today. An analysis using another method promoted by Fisher, mathematical likelihood, shows that the p value substantially overstates the evidence against the null hypothesis. Likelihood makes clearer the distinction between error rates and inferential evidence and is a quantitative tool for expressing evidential strength that is more appropriate for the purposes of epidemiology than the p value.

KW - Hypothesis tests
KW - Inference
KW - Likelihood
KW - P values
KW - Significance tests
UR - http://www.scopus.com/inward/record.url?scp=0027471290&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0027471290&partnerID=8YFLogxK
M3 - Article
VL - 137
SP - 485
EP - 496
JO - American Journal of Epidemiology
JF - American Journal of Epidemiology
SN - 0002-9262
IS - 5
ER -