Prediction of promoters and enhancers using multiple DNA methylation-associated features

Woochang Hwang, Verity F. Oliver, Shannath L. Merbs, Heng Zhu, Jiang Qian

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Background: Regulatory regions (e.g., promoters and enhancers) play an essential role in human development and disease. Many computational approaches have been developed to predict regulatory regions using various genomic features such as sequence motifs and evolutionary conservation. However, these DNA sequence-based approaches do not reflect the tissue-specific nature of regulatory regions. In this work, we propose to predict regulatory regions using multiple features derived from DNA methylation profiles. Results: We discovered several interesting features of the methylated CpG (mCpG) sites within regulatory regions. First, the hypomethylation of CpGs within regulatory regions, relative to the genomic background methylation level, extended more than 1000 bp from the center of the regulatory regions, demonstrating a high degree of correlation between the methylation statuses of neighboring mCpG sites. Second, when a regulatory region was inactive, as determined by histone mark differences between cell lines, the methylation level of the mCpG site increased from a hypomethylated state to a hypermethylated state, reaching a level even higher than the genomic background. Third, a distinct set of sequence motifs was overrepresented surrounding mCpG sites within regulatory regions. Using five types of features derived from DNA methylation profiles, we were able to predict promoters and enhancers with a machine-learning approach (a support vector machine). Prediction performance was strong for both promoters and enhancers, with areas under the ROC curve (AUC) of 0.992 and 0.817, respectively. This is better than prediction based on methylation level alone, especially for enhancers. Conclusions: Our study suggests that DNA methylation features of mCpG sites can be used to predict regulatory regions.
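As a reading aid for the AUC values quoted in the abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores, using the pairwise-ranking (Mann-Whitney) identity. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration; they are not data or code from the study, and the study's own evaluation pipeline is not described on this page.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking identity.

    AUC equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example,
    with tied scores counted as half a win.
    labels: 1 for a positive (e.g. regulatory) site, 0 for background.
    scores: classifier decision values, higher meaning "more positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the paper): four positives, four negatives.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")  # 15 of 16 positive/negative pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.992 (promoters) and 0.817 (enhancers) should be read.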

Original language: English (US)
Article number: S11
Journal: BMC Genomics
Volume: 16
Issue number: 7
DOIs
State: Published - Jun 11 2015

Keywords

  • DNA Methylation
  • Enhancer
  • Feature selection
  • Promoter
  • Regulatory region prediction
  • Support vector machine

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
