TY - JOUR
T1 - Landscape of allele-specific transcription factor binding in the human genome
AU - Abramov, Sergey
AU - Boytsov, Alexandr
AU - Bykova, Daria
AU - Penzar, Dmitry D.
AU - Yevshin, Ivan
AU - Kolmykov, Semyon K.
AU - Fridman, Marina V.
AU - Favorov, Alexander V.
AU - Vorontsov, Ilya E.
AU - Baulin, Eugene
AU - Kolpakov, Fedor
AU - Makeev, Vsevolod J.
AU - Kulakovskiy, Ivan V.
N1 - Funding Information:
We thank the organizers and members of the GRECO consortium for the series of workshops (held under European Union COST Action CA15205 GREEKC, coordinator Martin Kuiper) which provided a fruitful networking and discussion platform for ideas of this study. We personally thank Denis Litvinov for help in GTRD metadata processing and Evgenia Serebrova for help in paper preparation. This study was supported by RFBR grant 18-34-20024 to I.V.K. (basic ADASTRA pipeline), RSF grant 20-74-10075 to I.V.K. (machine learning and additional analysis), RSF grant 19-14-00295 to F.K. (GTRD data extraction).
Publisher Copyright:
© 2021, The Author(s).
PY - 2021/5/12
Y1 - 2021/5/12
N2 - Sequence variants in gene regulatory regions alter gene expression and contribute to phenotypes of individual cells and the whole organism, including disease susceptibility and progression. Single-nucleotide variants in enhancers or promoters may affect gene transcription by altering transcription factor binding sites. Differential transcription factor binding in heterozygous genomic loci provides a natural source of information on such regulatory variants. We present a novel approach to call the allele-specific transcription factor binding events at single-nucleotide variants in ChIP-Seq data, taking into account the joint contribution of aneuploidy and local copy number variation, that is estimated directly from variant calls. We have conducted a meta-analysis of more than 7 thousand ChIP-Seq experiments and assembled the database of allele-specific binding events listing more than half a million entries at nearly 270 thousand single-nucleotide polymorphisms for several hundred human transcription factors and cell types. These polymorphisms are enriched for associations with phenotypes of medical relevance and often overlap eQTLs, making candidates for causality by linking variants with molecular mechanisms. Specifically, there is a special class of switching sites, where different transcription factors preferably bind alternative alleles, thus revealing allele-specific rewiring of molecular circuitry.
UR - http://www.scopus.com/inward/record.url?scp=85105781901&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85105781901&partnerID=8YFLogxK
U2 - 10.1038/s41467-021-23007-0
DO - 10.1038/s41467-021-23007-0
M3 - Article
C2 - 33980847
AN - SCOPUS:85105781901
SN - 2041-1723
VL - 12
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 2751
ER -