TY - JOUR
T1 - Construction and analysis of an integrated regulatory network derived from high-throughput sequencing data
AU - Cheng, Chao
AU - Yan, Koon Kiu
AU - Hwang, Woochang
AU - Qian, Jiang
AU - Bhardwaj, Nitin
AU - Rozowsky, Joel
AU - Lu, Zhi John
AU - Niu, Wei
AU - Alves, Pedro
AU - Kato, Masaomi
AU - Snyder, Michael
AU - Gerstein, Mark
PY - 2011/11
Y1 - 2011/11
N2 - We present a network framework for analyzing multi-level regulation in higher eukaryotes based on systematic integration of various high-throughput datasets. The network, namely the integrated regulatory network, consists of three major types of regulation: TF→gene, TF→miRNA and miRNA→gene. We identified the target genes and target miRNAs for a set of TFs based on the ChIP-Seq binding profiles, and the predicted targets of miRNAs using annotated 3′UTR sequences and conservation information. Making use of the system-wide RNA-Seq profiles, we classified transcription factors into positive and negative regulators and assigned a sign to each regulatory interaction. Other types of edges, such as protein-protein interactions and potential intra-regulations between miRNAs based on the embedding of miRNAs in their host genes, were further incorporated. We examined the topological structures of the network, including its hierarchical organization and motif enrichment. We found that transcription factors downstream in the hierarchy distinguish themselves by being expressed more uniformly across tissues, having more interacting partners, and being more likely to be essential. We found an over-representation of notable network motifs, including a feed-forward loop (FFL) in which a miRNA cost-effectively shuts down a transcription factor and its target. We used C. elegans data from the modENCODE project as a primary model to illustrate our framework, and further verified the results using two other datasets. As more genome-wide ChIP-Seq and RNA-Seq data become available, our methods of data integration have various potential applications.
AB - We present a network framework for analyzing multi-level regulation in higher eukaryotes based on systematic integration of various high-throughput datasets. The network, namely the integrated regulatory network, consists of three major types of regulation: TF→gene, TF→miRNA and miRNA→gene. We identified the target genes and target miRNAs for a set of TFs based on the ChIP-Seq binding profiles, and the predicted targets of miRNAs using annotated 3′UTR sequences and conservation information. Making use of the system-wide RNA-Seq profiles, we classified transcription factors into positive and negative regulators and assigned a sign to each regulatory interaction. Other types of edges, such as protein-protein interactions and potential intra-regulations between miRNAs based on the embedding of miRNAs in their host genes, were further incorporated. We examined the topological structures of the network, including its hierarchical organization and motif enrichment. We found that transcription factors downstream in the hierarchy distinguish themselves by being expressed more uniformly across tissues, having more interacting partners, and being more likely to be essential. We found an over-representation of notable network motifs, including a feed-forward loop (FFL) in which a miRNA cost-effectively shuts down a transcription factor and its target. We used C. elegans data from the modENCODE project as a primary model to illustrate our framework, and further verified the results using two other datasets. As more genome-wide ChIP-Seq and RNA-Seq data become available, our methods of data integration have various potential applications.
UR - http://www.scopus.com/inward/record.url?scp=81355164253&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=81355164253&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1002190
DO - 10.1371/journal.pcbi.1002190
M3 - Article
C2 - 22125477
AN - SCOPUS:81355164253
SN - 1553-734X
VL - 7
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 11
M1 - e1002190
ER -