Multiple testing of local maxima for detection of peaks in ChIP-Seq data

Armin Schwartzman, Andrew Jaffe, Yulia Gavrilov, Clifford A. Meyer

Research output: Contribution to journal › Article

Abstract

A topological multiple testing approach to peak detection is proposed for the problem of detecting transcription factor binding sites in ChIP-Seq data. After kernel smoothing of the tag counts over the genome, the presence of a peak is tested at each observed local maximum, followed by multiple testing correction at the desired false discovery rate level. Valid p-values for candidate peaks are computed via Monte Carlo simulations of smoothed Poisson sequences, whose background Poisson rates are obtained via linear regression from a Control sample at two different scales. The proposed method identifies nearby binding sites that other methods do not.
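The steps named in the abstract (kernel smoothing, testing at each local maximum, Monte Carlo p-values from smoothed Poisson sequences, and false discovery rate control) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a constant, known background rate `rate`, whereas the paper estimates background rates by regression from a Control sample at two scales. It also assumes a truncated Gaussian kernel and the Benjamini-Hochberg procedure for the multiple testing correction. All function names and default parameters are hypothetical.

```python
import numpy as np


def gaussian_kernel(bandwidth):
    """Unit-sum Gaussian kernel truncated at 4 bandwidths (an assumed choice)."""
    half = int(4 * bandwidth)
    x = np.arange(-half, half + 1)
    w = np.exp(-0.5 * (x / bandwidth) ** 2)
    return w / w.sum()


def smooth(counts, kernel):
    """Kernel-smooth a sequence of tag counts; output has the input length."""
    return np.convolve(counts, kernel, mode="same")


def local_maxima(y):
    """Indices of strict interior local maxima of y."""
    interior = (y[1:-1] > y[:-2]) & (y[1:-1] > y[2:])
    return np.flatnonzero(interior) + 1


def null_peak_heights(rate, length, kernel, n_sim, rng):
    """Sorted heights of local maxima in smoothed pure-background sequences."""
    heights = []
    for _ in range(n_sim):
        sim = smooth(rng.poisson(rate, size=length).astype(float), kernel)
        heights.append(sim[local_maxima(sim)])
    return np.sort(np.concatenate(heights))


def monte_carlo_pvalues(observed, null_sorted):
    """Fraction of null peak heights >= each observed height (add-one form)."""
    n_ge = null_sorted.size - np.searchsorted(null_sorted, observed, side="left")
    return (n_ge + 1) / (null_sorted.size + 1)


def benjamini_hochberg(pvals, alpha):
    """Boolean rejection mask from the Benjamini-Hochberg step-up procedure."""
    m = pvals.size
    reject = np.zeros(m, dtype=bool)
    if m == 0:
        return reject
    order = np.argsort(pvals)
    passed = pvals[order] <= alpha * np.arange(1, m + 1) / m
    if passed.any():
        k = np.flatnonzero(passed).max()  # largest rank meeting its threshold
        reject[order[: k + 1]] = True
    return reject


def detect_peaks(counts, rate, bandwidth=5.0, alpha=0.05, n_sim=200, seed=0):
    """Return positions of local maxima declared significant at FDR level alpha."""
    rng = np.random.default_rng(seed)
    kernel = gaussian_kernel(bandwidth)
    y = smooth(np.asarray(counts, dtype=float), kernel)
    candidates = local_maxima(y)
    null = null_peak_heights(rate, len(counts), kernel, n_sim, rng)
    pvals = monte_carlo_pvalues(y[candidates], null)
    return candidates[benjamini_hochberg(pvals, alpha)]
```

Calling `detect_peaks(counts, rate=0.5)` on a one-dimensional array of tag counts returns the indices of the significant local maxima. Testing only at local maxima keeps the number of hypotheses close to the number of candidate peaks rather than the number of genomic positions, which is the point of the topological approach the abstract describes.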

Original language: English (US)
Pages (from-to): 471-494
Number of pages: 24
Journal: Annals of Applied Statistics
Volume: 7
Issue number: 1
DOI: 10.1214/12-AOAS594
State: Published - Mar 2013

Keywords

  • False discovery rate
  • Kernel smoothing
  • Matched filter
  • Poisson sequence
  • Topological inference

ASJC Scopus subject areas

  • Statistics, Probability and Uncertainty
  • Modeling and Simulation
  • Statistics and Probability

Cite this

Multiple testing of local maxima for detection of peaks in ChIP-Seq data. / Schwartzman, Armin; Jaffe, Andrew; Gavrilov, Yulia; Meyer, Clifford A.

In: Annals of Applied Statistics, Vol. 7, No. 1, 03.2013, p. 471-494.

@article{3644b5a83ea04923b25609407aa27ae0,
title = "Multiple testing of local maxima for detection of peaks in ChIP-Seq data",
abstract = "A topological multiple testing approach to peak detection is proposed for the problem of detecting transcription factor binding sites in ChIP-Seq data. After kernel smoothing of the tag counts over the genome, the presence of a peak is tested at each observed local maximum, followed by multiple testing correction at the desired false discovery rate level. Valid p-values for candidate peaks are computed via Monte Carlo simulations of smoothed Poisson sequences, whose background Poisson rates are obtained via linear regression from a Control sample at two different scales. The proposed method identifies nearby binding sites that other methods do not.",
keywords = "False discovery rate, Kernel smoothing, Matched filter, Poisson sequence, Topological inference",
author = "Armin Schwartzman and Andrew Jaffe and Yulia Gavrilov and Meyer, {Clifford A.}",
year = "2013",
month = "3",
doi = "10.1214/12-AOAS594",
language = "English (US)",
volume = "7",
pages = "471--494",
journal = "Annals of Applied Statistics",
issn = "1932-6157",
publisher = "Institute of Mathematical Statistics",
number = "1",
}

TY - JOUR
T1 - Multiple testing of local maxima for detection of peaks in ChIP-Seq data
AU - Schwartzman, Armin
AU - Jaffe, Andrew
AU - Gavrilov, Yulia
AU - Meyer, Clifford A.
PY - 2013/3
Y1 - 2013/3
N2 - A topological multiple testing approach to peak detection is proposed for the problem of detecting transcription factor binding sites in ChIP-Seq data. After kernel smoothing of the tag counts over the genome, the presence of a peak is tested at each observed local maximum, followed by multiple testing correction at the desired false discovery rate level. Valid p-values for candidate peaks are computed via Monte Carlo simulations of smoothed Poisson sequences, whose background Poisson rates are obtained via linear regression from a Control sample at two different scales. The proposed method identifies nearby binding sites that other methods do not.
KW - False discovery rate
KW - Kernel smoothing
KW - Matched filter
KW - Poisson sequence
KW - Topological inference
UR - http://www.scopus.com/inward/record.url?scp=84876035152&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84876035152&partnerID=8YFLogxK
U2 - 10.1214/12-AOAS594
DO - 10.1214/12-AOAS594
M3 - Article
AN - SCOPUS:84876035152
VL - 7
SP - 471
EP - 494
JO - Annals of Applied Statistics
JF - Annals of Applied Statistics
SN - 1932-6157
IS - 1
ER -