Global Mapping of Transcription Factor Binding Sites by Sequencing Chromatin Surrogates: A Perspective on Experimental Design, Data Analysis, and Open Problems

Yingying Wei, George Wu, Hongkai Ji

Research output: Contribution to journal › Article › peer-review

Abstract

Mapping genome-wide binding sites of all transcription factors (TFs) in all biological contexts is a critical step toward understanding gene regulation. The state-of-the-art technologies for mapping transcription factor binding sites (TFBSs) couple chromatin immunoprecipitation (ChIP) with high-throughput sequencing (ChIP-seq) or tiling array hybridization (ChIP-chip). These technologies have limitations: they are low-throughput with respect to surveying many TFs. Recent advances in genome-wide chromatin profiling, including development of technologies such as DNase-seq, FAIRE-seq and ChIP-seq for histone modifications, make it possible to predict in vivo TFBSs by analyzing chromatin features at computationally determined DNA motif sites. This promising new approach may allow researchers to monitor the genome-wide binding sites of many TFs simultaneously. In this article, we discuss various experimental design and data analysis issues that arise when applying this approach. Through a systematic analysis of the data from the Encyclopedia of DNA Elements (ENCODE) project, we compare the predictive power of individual chromatin marks and their combinations using supervised and unsupervised learning methods, and evaluate the value of integrating information from public ChIP and gene expression data. We also highlight the challenges and opportunities for developing novel analytical methods, such as resolving the one-motif-multiple-TF ambiguity and distinguishing functional from non-functional TF binding targets among the predicted binding sites.

Original language: English (US)
Pages (from-to): 156-178
Number of pages: 23
Journal: Statistics in Biosciences
Volume: 5
Issue number: 1
State: Published - May 2013

Keywords

  • ChIP-seq
  • DNase-seq
  • FAIRE-seq
  • Motif
  • Next-generation sequencing
  • Transcription factor binding sites

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry, Genetics and Molecular Biology (miscellaneous)
