Efficient p-value evaluation for resampling-based tests

Kai Yu, Faming Liang, Julia Ciampa, Nilanjan Chatterjee

Research output: Contribution to journal › Article

Abstract

The resampling-based test, which often relies on permutation or bootstrap procedures, has been widely used for statistical hypothesis testing when the asymptotic distribution of the test statistic is unavailable or unreliable. It requires repeated calculation of the test statistic on a large number of simulated data sets to assess its significance level, and thus it can become very computationally intensive. Here, we propose an efficient p-value evaluation procedure by adapting the stochastic approximation Markov chain Monte Carlo algorithm. The new procedure can be used easily for estimating the p-value for any resampling-based test. We show through numerical simulations that the proposed procedure can be 100 to 500,000 times as efficient (in terms of computing time) as the standard resampling-based procedure when evaluating a test statistic with a small p-value (e.g., less than 10⁻⁶). With its computational burden reduced by the proposed procedure, the versatile resampling-based test becomes computationally feasible for a much wider range of applications. We demonstrate the new method by applying it to a large-scale genetic association study of prostate cancer.
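To make the computational burden concrete, here is a minimal sketch of the standard resampling baseline that the abstract compares against. It is not the stochastic approximation Markov chain Monte Carlo procedure proposed in the article. The function names `permutation_p_value` and `resamples_needed`, the difference-in-means statistic, and the 10% relative-error target are all assumptions chosen for illustration.

```python
import numpy as np


def permutation_p_value(x, y, n_resamples=10_000, seed=0):
    """Plain Monte Carlo permutation p-value for a difference in group means.

    Illustrative baseline only (hypothetical helper, not the article's method).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pooled = np.concatenate([x, y])
    observed = abs(x.mean() - y.mean())

    exceed = 0
    for _ in range(n_resamples):
        # Shuffle group labels by permuting the pooled sample.
        perm = rng.permutation(pooled)
        stat = abs(perm[: x.size].mean() - perm[x.size:].mean())
        if stat >= observed:
            exceed += 1

    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_resamples + 1)


def resamples_needed(p, rel_error=0.1):
    """Resamples needed for the plain estimator to reach a relative standard error.

    Uses the binomial approximation: relative SE = sqrt((1 - p) / (N * p)).
    """
    return int(np.ceil((1.0 - p) / (p * rel_error ** 2)))
```

Under the binomial approximation used here, estimating a p-value near 10⁻⁶ to within about 10% relative standard error takes on the order of 10⁸ recomputations of the test statistic. That cost is what motivates a more efficient evaluation procedure.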

Original language: English (US)
Pages (from-to): 582-593
Number of pages: 12
Journal: Biostatistics
Volume: 12
Issue number: 3
DOIs: 10.1093/biostatistics/kxq078
State: Published - Jul 2011
Externally published: Yes

Keywords

  • Bootstrap procedures
  • Genetic association studies
  • p-value
  • Resampling-based tests
  • Stochastic approximation Markov chain Monte Carlo

ASJC Scopus subject areas

  • Medicine (all)
  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Efficient p-value evaluation for resampling-based tests. / Yu, Kai; Liang, Faming; Ciampa, Julia; Chatterjee, Nilanjan.

In: Biostatistics, Vol. 12, No. 3, 07.2011, p. 582-593.

Yu, Kai; Liang, Faming; Ciampa, Julia; Chatterjee, Nilanjan. / Efficient p-value evaluation for resampling-based tests. In: Biostatistics. 2011; Vol. 12, No. 3. pp. 582-593.

@article{e9cbf42256e34b29a4e2baaf09587389,
title = "Efficient p-value evaluation for resampling-based tests",
abstract = "The resampling-based test, which often relies on permutation or bootstrap procedures, has been widely used for statistical hypothesis testing when the asymptotic distribution of the test statistic is unavailable or unreliable. It requires repeated calculation of the test statistic on a large number of simulated data sets to assess its significance level, and thus it can become very computationally intensive. Here, we propose an efficient p-value evaluation procedure by adapting the stochastic approximation Markov chain Monte Carlo algorithm. The new procedure can be used easily for estimating the p-value for any resampling-based test. We show through numerical simulations that the proposed procedure can be 100 to 500,000 times as efficient (in terms of computing time) as the standard resampling-based procedure when evaluating a test statistic with a small p-value (e.g., less than $10^{-6}$). With its computational burden reduced by the proposed procedure, the versatile resampling-based test becomes computationally feasible for a much wider range of applications. We demonstrate the new method by applying it to a large-scale genetic association study of prostate cancer.",
keywords = "Bootstrap procedures, Genetic association studies, p-value, Resampling-based tests, Stochastic approximation Markov chain Monte Carlo",
author = "Kai Yu and Faming Liang and Julia Ciampa and Nilanjan Chatterjee",
year = "2011",
month = "7",
doi = "10.1093/biostatistics/kxq078",
language = "English (US)",
volume = "12",
pages = "582--593",
journal = "Biostatistics",
issn = "1465-4644",
publisher = "Oxford University Press",
number = "3",

}

TY - JOUR

T1 - Efficient p-value evaluation for resampling-based tests

AU - Yu, Kai

AU - Liang, Faming

AU - Ciampa, Julia

AU - Chatterjee, Nilanjan

PY - 2011/7

Y1 - 2011/7

AB - The resampling-based test, which often relies on permutation or bootstrap procedures, has been widely used for statistical hypothesis testing when the asymptotic distribution of the test statistic is unavailable or unreliable. It requires repeated calculation of the test statistic on a large number of simulated data sets to assess its significance level, and thus it can become very computationally intensive. Here, we propose an efficient p-value evaluation procedure by adapting the stochastic approximation Markov chain Monte Carlo algorithm. The new procedure can be used easily for estimating the p-value for any resampling-based test. We show through numerical simulations that the proposed procedure can be 100 to 500,000 times as efficient (in terms of computing time) as the standard resampling-based procedure when evaluating a test statistic with a small p-value (e.g., less than 10⁻⁶). With its computational burden reduced by the proposed procedure, the versatile resampling-based test becomes computationally feasible for a much wider range of applications. We demonstrate the new method by applying it to a large-scale genetic association study of prostate cancer.

KW - Bootstrap procedures

KW - Genetic association studies

KW - p-value

KW - Resampling-based tests

KW - Stochastic approximation Markov chain Monte Carlo

UR - http://www.scopus.com/inward/record.url?scp=79959416224&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79959416224&partnerID=8YFLogxK

U2 - 10.1093/biostatistics/kxq078

DO - 10.1093/biostatistics/kxq078

M3 - Article

C2 - 21209154

AN - SCOPUS:79959416224

VL - 12

SP - 582

EP - 593

JO - Biostatistics

JF - Biostatistics

SN - 1465-4644

IS - 3

ER -