The GC skew near Pol II start sites and its association with SP1-binding site variants

Yu A. Medvedeva, I. V. Kulakovskii, N. Yu Oparina, A. V. Favorov, V. Yu Makeev

Research output: Contribution to journal, Article, peer-review


Nucleotide sequences of DNA within clusters of transcription start sites identified by Cap Analysis of Gene Expression (CAGE) have several distinctive features. DNA within such clusters is enriched in cytosine and guanine, and its GC skew is consistent with selection of a coding strand in which the G content exceeds the C content. On the coding strand, however, the frequency of tracts of the avoided cytosine, normalized to the expectation calculated from the local content of the nucleotide in the cluster, is significantly higher than that of tracts of the preferred guanine. Similarly, the statistical significance of the C-rich variant of the binding site for the transcription factor Sp1 on the coding strand is higher than that of the G-rich variant. Yet it is unlikely that the choice of the Sp1 site variant is induced by the coding strand selection. Rather, it is more likely that both variants are roughly equiprobable, and that functional Sp1 binding acts as a selection factor that counteracts the mutations producing the GC skew.
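The two quantities the abstract compares can be illustrated with a minimal sketch. It assumes the conventional strand-skew definition (G - C)/(G + C) and, for tracts, an expectation in which positions are independent and each base occurs at its local frequency in the sequence. The abstract does not spell out the authors' exact normalization, so the function names `gc_skew` and `tract_ratio` and the independence model are illustrative assumptions, not the paper's method.

```python
from collections import Counter


def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for one strand; 0.0 if the strand has no G or C."""
    counts = Counter(seq.upper())
    g, c = counts["G"], counts["C"]
    return (g - c) / (g + c) if (g + c) else 0.0


def tract_ratio(seq: str, base: str, length: int) -> float:
    """Observed / expected number of windows made up only of `base`.

    Assumption (not from the paper): the expectation treats positions as
    independent, with `base` occurring at its local frequency in `seq`.
    """
    seq = seq.upper()
    base = base.upper()
    n_windows = len(seq) - length + 1
    if n_windows <= 0:
        return 0.0
    tract = base * length
    # Count overlapping windows that match the homopolymer tract exactly.
    observed = sum(1 for i in range(n_windows) if seq[i:i + length] == tract)
    p = seq.count(base) / len(seq)       # local frequency of the base
    expected = n_windows * p ** length   # expected matches under independence
    return observed / expected if expected else 0.0


if __name__ == "__main__":
    toy = "GGAGCCCGGTGGGACCCAG"  # toy sequence, not real CAGE data
    print("GC skew:", gc_skew(toy))
    print("C-tract ratio:", tract_ratio(toy, "C", 3))
    print("G-tract ratio:", tract_ratio(toy, "G", 3))
```

A positive `gc_skew` means G exceeds C on the strand examined. Under these assumptions, a `tract_ratio` above 1 means tracts of that base occur more often than the local base composition alone would predict.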

Original language: English (US)
Pages (from-to): 901-907
Number of pages: 7
Issue number: 6
State: Published - Dec 2010
Externally published: Yes


  • CAGE
  • Homo sapiens
  • Sp1
  • cap analysis of gene expression
  • transcription factor

ASJC Scopus subject areas

  • Biophysics

