Experimental validation of predicted mammalian erythroid cis-regulatory modules

Hao Wang, Ying Zhang, Yong Cheng, Yuepin Zhou, David C. King, James Taylor, Francesca Chiaromonte, Jyotsna Kasturi, Hanna Petrykowska, Brian Gibb, Christine Dorman, Webb Miller, Louis C. Dore, John Welch, Mitchell J. Weiss, Ross C. Hardison

Research output: Contribution to journal › Article

Abstract

Multiple alignments of genome sequences are helpful guides to functional analysis, but predicting cis-regulatory modules (CRMs) accurately from such alignments remains an elusive goal. We predict CRMs for mammalian genes expressed in red blood cells by combining two properties gleaned from aligned, noncoding genome sequences: a positive regulatory potential (RP) score, which detects similarity to patterns in alignments distinctive for regulatory regions, and conservation of a binding site motif for the essential erythroid transcription factor GATA-1. Within eight target loci, we tested 75 noncoding segments by reporter gene assays in transiently transfected human K562 cells and/or after site-directed integration into murine erythroleukemia cells. Segments with a high RP score and a conserved exact match to the binding site consensus are validated at a good rate (50%-100%, with rates increasing at higher RP), whereas segments with lower RP scores or nonconsensus binding motifs tend to be inactive. Active DNA segments were shown to be occupied by GATA-1 protein by chromatin immunoprecipitation, whereas sites predicted to be inactive were not occupied. We verify four previously known erythroid CRMs and identify 28 novel ones. Thus, high RP in combination with another feature of a CRM, such as a conserved transcription factor binding site, is a good predictor of functional CRMs. Genome-wide predictions based on RP and a large set of well-defined transcription factor binding sites are available through servers at http://www.bx.psu.edu/.
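The prediction rule summarized in the abstract (a positive RP score combined with a conserved exact match to the GATA-1 binding site consensus) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' pipeline: the consensus is taken to be WGATAR (W = A or T, R = A or G), which the abstract does not spell out; "conserved" is simplified to mean an exact match starting at the same alignment column in every species; and the names `Segment`, `motif_starts`, `has_conserved_site`, and `predict_crm` are hypothetical.

```python
import re
from dataclasses import dataclass

# Assumption: WGATAR as the GATA-1 consensus, checked on both strands.
# YTATCW is the reverse complement of WGATAR.
GATA_SITE = re.compile(r"[AT]GATA[AG]|[CT]TATC[AT]")
MOTIF_LEN = 6


@dataclass
class Segment:
    """Hypothetical container for one noncoding segment."""
    name: str
    rp_score: float   # regulatory potential score for the segment
    aligned: dict     # species -> aligned sequence ('-' marks gaps)


def motif_starts(row):
    """Alignment columns where an ungapped consensus match begins."""
    row = row.upper()
    return {
        i
        for i in range(len(row) - MOTIF_LEN + 1)
        if GATA_SITE.fullmatch(row[i:i + MOTIF_LEN])
    }


def has_conserved_site(aligned):
    """True if every species has an exact match at a shared column."""
    rows = list(aligned.values())
    if not rows:
        return False
    shared = set.intersection(*(motif_starts(r) for r in rows))
    return bool(shared)


def predict_crm(seg, rp_threshold=0.0):
    """Simplified rule: RP above threshold AND a conserved consensus match."""
    return seg.rp_score > rp_threshold and has_conserved_site(seg.aligned)


candidates = [
    Segment("seg_a", 0.10, {"human": "CCAGATAAGG", "mouse": "CCTGATAGGG"}),
    Segment("seg_b", 0.10, {"human": "CCAGATAAGG", "mouse": "CCAGATCAGG"}),
    Segment("seg_c", -0.05, {"human": "CCAGATAAGG", "mouse": "CCTGATAGGG"}),
]
predicted = [s.name for s in candidates if predict_crm(s)]
```

Here `seg_b` fails because the mouse row lacks an exact match, and `seg_c` fails on its negative RP score. The default threshold of 0.0 reflects the abstract's "positive RP" wording; the reported validation rates rise at higher RP, so a stricter cutoff would be a reasonable variation.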

Original language: English (US)
Pages (from-to): 1480-1492
Number of pages: 13
Journal: Genome Research
Volume: 16
Issue number: 12
DOI: https://doi.org/10.1101/gr.5353806
State: Published - Dec 2006
Externally published: Yes

Fingerprint

Binding Sites
Genome
GATA1 Transcription Factor
Transcription Factors
Leukemia, Erythroblastic, Acute
K562 Cells
Sequence Alignment
Gene Regulatory Networks
Chromatin Immunoprecipitation
Nucleic Acid Regulatory Sequences
Reporter Genes
Erythrocytes
DNA
Proteins

ASJC Scopus subject areas

  • Genetics

Cite this

Wang, H., Zhang, Y., Cheng, Y., Zhou, Y., King, D. C., Taylor, J., ... Hardison, R. C. (2006). Experimental validation of predicted mammalian erythroid cis-regulatory modules. Genome Research, 16(12), 1480-1492. https://doi.org/10.1101/gr.5353806

Experimental validation of predicted mammalian erythroid cis-regulatory modules. / Wang, Hao; Zhang, Ying; Cheng, Yong; Zhou, Yuepin; King, David C.; Taylor, James; Chiaromonte, Francesca; Kasturi, Jyotsna; Petrykowska, Hanna; Gibb, Brian; Dorman, Christine; Miller, Webb; Dore, Louis C.; Welch, John; Weiss, Mitchell J.; Hardison, Ross C.

In: Genome Research, Vol. 16, No. 12, 12.2006, p. 1480-1492.

Wang, H, Zhang, Y, Cheng, Y, Zhou, Y, King, DC, Taylor, J, Chiaromonte, F, Kasturi, J, Petrykowska, H, Gibb, B, Dorman, C, Miller, W, Dore, LC, Welch, J, Weiss, MJ & Hardison, RC 2006, 'Experimental validation of predicted mammalian erythroid cis-regulatory modules', Genome Research, vol. 16, no. 12, pp. 1480-1492. https://doi.org/10.1101/gr.5353806
Wang, Hao; Zhang, Ying; Cheng, Yong; Zhou, Yuepin; King, David C.; Taylor, James; Chiaromonte, Francesca; Kasturi, Jyotsna; Petrykowska, Hanna; Gibb, Brian; Dorman, Christine; Miller, Webb; Dore, Louis C.; Welch, John; Weiss, Mitchell J.; Hardison, Ross C. / Experimental validation of predicted mammalian erythroid cis-regulatory modules. In: Genome Research. 2006; Vol. 16, No. 12. pp. 1480-1492.
@article{ec385f726aca4f4eabd9adda106842cd,
title = "Experimental validation of predicted mammalian erythroid cis-regulatory modules",
abstract = "Multiple alignments of genome sequences are helpful guides to functional analysis, but predicting cis-regulatory modules (CRMs) accurately from such alignments remains an elusive goal. We predict CRMs for mammalian genes expressed in red blood cells by combining two properties gleaned from aligned, noncoding genome sequences: a positive regulatory potential (RP) score, which detects similarity to patterns in alignments distinctive for regulatory regions, and conservation of a binding site motif for the essential erythroid transcription factor GATA-1. Within eight target loci, we tested 75 noncoding segments by reporter gene assays in transiently transfected human K562 cells and/or after site-directed integration into murine erythroleukemia cells. Segments with a high RP score and a conserved exact match to the binding site consensus are validated at a good rate (50{\%}-100{\%}, with rates increasing at higher RP), whereas segments with lower RP scores or nonconsensus binding motifs tend to be inactive. Active DNA segments were shown to be occupied by GATA-1 protein by chromatin immunoprecipitation, whereas sites predicted to be inactive were not occupied. We verify four previously known erythroid CRMs and identify 28 novel ones. Thus, high RP in combination with another feature of a CRM, such as a conserved transcription factor binding site, is a good predictor of functional CRMs. Genome-wide predictions based on RP and a large set of well-defined transcription factor binding sites are available through servers at http://www.bx.psu.edu/.",
author = "Hao Wang and Ying Zhang and Yong Cheng and Yuepin Zhou and King, {David C.} and James Taylor and Francesca Chiaromonte and Jyotsna Kasturi and Hanna Petrykowska and Brian Gibb and Christine Dorman and Webb Miller and Dore, {Louis C.} and John Welch and Weiss, {Mitchell J.} and Hardison, {Ross C.}",
year = "2006",
month = dec,
doi = "10.1101/gr.5353806",
language = "English (US)",
volume = "16",
pages = "1480--1492",
journal = "Genome Research",
issn = "1088-9051",
publisher = "Cold Spring Harbor Laboratory Press",
number = "12",
}

TY - JOUR

T1 - Experimental validation of predicted mammalian erythroid cis-regulatory modules

AU - Wang, Hao

AU - Zhang, Ying

AU - Cheng, Yong

AU - Zhou, Yuepin

AU - King, David C.

AU - Taylor, James

AU - Chiaromonte, Francesca

AU - Kasturi, Jyotsna

AU - Petrykowska, Hanna

AU - Gibb, Brian

AU - Dorman, Christine

AU - Miller, Webb

AU - Dore, Louis C.

AU - Welch, John

AU - Weiss, Mitchell J.

AU - Hardison, Ross C.

PY - 2006/12

Y1 - 2006/12

AB - Multiple alignments of genome sequences are helpful guides to functional analysis, but predicting cis-regulatory modules (CRMs) accurately from such alignments remains an elusive goal. We predict CRMs for mammalian genes expressed in red blood cells by combining two properties gleaned from aligned, noncoding genome sequences: a positive regulatory potential (RP) score, which detects similarity to patterns in alignments distinctive for regulatory regions, and conservation of a binding site motif for the essential erythroid transcription factor GATA-1. Within eight target loci, we tested 75 noncoding segments by reporter gene assays in transiently transfected human K562 cells and/or after site-directed integration into murine erythroleukemia cells. Segments with a high RP score and a conserved exact match to the binding site consensus are validated at a good rate (50%-100%, with rates increasing at higher RP), whereas segments with lower RP scores or nonconsensus binding motifs tend to be inactive. Active DNA segments were shown to be occupied by GATA-1 protein by chromatin immunoprecipitation, whereas sites predicted to be inactive were not occupied. We verify four previously known erythroid CRMs and identify 28 novel ones. Thus, high RP in combination with another feature of a CRM, such as a conserved transcription factor binding site, is a good predictor of functional CRMs. Genome-wide predictions based on RP and a large set of well-defined transcription factor binding sites are available through servers at http://www.bx.psu.edu/.

UR - http://www.scopus.com/inward/record.url?scp=33845316442&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33845316442&partnerID=8YFLogxK

U2 - 10.1101/gr.5353806

DO - 10.1101/gr.5353806

M3 - Article

C2 - 17038566

AN - SCOPUS:33845316442

VL - 16

SP - 1480

EP - 1492

JO - Genome Research

JF - Genome Research

SN - 1088-9051

IS - 12

ER -