Assessing the model transferability for prediction of transcription factor binding sites based on chromatin accessibility

Sheng Liu, Cristina Zibetti, Jun Wan, Guohua Wang, Seth Blackshaw, Jiang Qian

Research output: Contribution to journalArticlepeer-review

Abstract

Background: Computational prediction of transcription factor (TF) binding sites in different cell types is challenging. Recent technology development allows us to determine the genome-wide chromatin accessibility in various cellular and developmental contexts. The chromatin accessibility profiles provide useful information in prediction of TF binding events in various physiological conditions. Furthermore, ChIP-Seq analysis was used to determine genome-wide binding sites for a range of different TFs in multiple cell types. Integration of these two types of genomic information can improve the prediction of TF binding events. Results: We assessed to what extent a model built upon on other TFs and/or other cell types could be used to predict the binding sites of TFs of interest. A random forest model was built using a set of cell type-independent features such as specific sequences recognized by the TFs and evolutionary conservation, as well as cell type-specific features derived from chromatin accessibility data. Our analysis suggested that the models learned from other TFs and/or cell lines performed almost as well as the model learned from the target TF in the cell type of interest. Interestingly, models based on multiple TFs performed better than single-TF models. Finally, we proposed a universal model, BPAC, which was generated using ChIP-Seq data from multiple TFs in various cell types. Conclusion: Integrating chromatin accessibility information with sequence information improves prediction of TF binding.The prediction of TF binding is transferable across TFs and/or cell lines suggesting there are a set of universal "rules". A computational tool was developed to predict TF binding sites based on the universal "rules".

Original languageEnglish (US)
Article number355
JournalBMC Bioinformatics
Volume18
Issue number1
DOIs
StatePublished - Jul 27 2017

Keywords

  • Chromatin accessibility
  • Feature selection
  • Machine learning
  • Transcription factor binding prediction

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Fingerprint Dive into the research topics of 'Assessing the model transferability for prediction of transcription factor binding sites based on chromatin accessibility'. Together they form a unique fingerprint.

Cite this