TY - JOUR
T1 - Characterization of Binding Sites of Eukaryotic Transcription Factors
AU - Qian, Jiang
AU - Lin, Jimmy
AU - Zack, Donald J.
N1 - Funding Information: We thank Dongmei Liu and Dr. Giovanni Parmigiani for stimulating discussions. The research is supported in part by grants from the National Eye Institute and the Foundation Fighting Blindness, and by a generous gift from Mr. Robert Smith and Mrs. Clarice Smith. DJZ is the Guerrieri Professor of Genetic Engineering and Molecular Ophthalmology and the recipient of a Research to Prevent Blindness Senior Investigator Award.
PY - 2006/5
Y1 - 2006/5
AB - To explore the nature of eukaryotic transcription factor (TF) binding sites and determine how they differ from surrounding DNA sequences, we examined four features associated with DNA binding sites: G+C content, pattern complexity, palindromic structure, and Markov sequence ordering. Our analysis of the regulatory motifs obtained from the TRANSFAC database, using yeast intergenic sequences as background, revealed that these four features show variable enrichment in motif sequences. For example, motif sequences were more likely to have palindromic structure than were background sequences. In addition, these features were tightly localized to the regulatory motifs, indicating that they are a property of the motif sequences themselves and are not shared by the general promoter "environment" in which the regulatory motifs reside. By breaking down the motif sequences according to the TF classes to which they bind, more specific associations were identified. Finally, we found that some correlations, such as G+C content enrichment, were species-specific, while others, such as complexity enrichment, were universal across the species examined. The quantitative analysis provided here should increase our understanding of protein-DNA interactions and also help facilitate the discovery of regulatory motifs through bioinformatics.
KW - bioinformatics
KW - gene regulation
KW - promoter
KW - transcription factor
UR - http://www.scopus.com/inward/record.url?scp=33747373399&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33747373399&partnerID=8YFLogxK
U2 - 10.1016/S1672-0229(06)60019-3
DO - 10.1016/S1672-0229(06)60019-3
M3 - Article
C2 - 16970547
AN - SCOPUS:33747373399
SN - 1672-0229
VL - 4
SP - 67
EP - 79
JO - Genomics, Proteomics and Bioinformatics
JF - Genomics, Proteomics and Bioinformatics
IS - 2
ER -