TY - JOUR
T1 - EAGLE
T2 - An algorithm that utilizes a small number of genomic features to predict tissue/cell type-specific enhancer-gene interactions
AU - Gao, Tianshun
AU - Qian, Jiang
N1 - Funding Information:
This work was supported by National Institutes of Health grants (EY024580, GM111514, EY029548, and EY001765 to JQ). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. We thank Drs. Minwen Hu, Jianbo Pan, and Jie Wang for their insightful comments.
Publisher Copyright:
© 2019 Gao, Qian. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2019
Y1 - 2019
N2 - Long-range regulation by distal enhancers is crucial for many biological processes. The existing methods for enhancer-target gene prediction often require many genomic features. This makes them difficult to be applied to many cell types, in which the relevant datasets are not always available. Here, we design a tool EAGLE, an enhancer and gene learning ensemble method for identification of Enhancer-Gene (EG) interactions. Unlike existing tools, EAGLE used only six features derived from the genomic features of enhancers and gene expression datasets. Cross-validation revealed that EAGLE outperformed other existing methods. Enrichment analyses on special transcriptional factors, epigenetic modifications, and eQTLs demonstrated that EAGLE could distinguish the interacting pairs from non-interacting ones. Finally, EAGLE was applied to mouse and human genomes and identified 7,680,203 and 7,437,255 EG interactions involving 31,375 and 43,724 genes, 138,547 and 177,062 enhancers across 89 and 110 tissue/cell types in mouse and human, respectively. The obtained interactions are accessible through an interactive database enhanceratlas.org. The EAGLE method is available at https://github.com/EvansGao/EAGLE and the predicted datasets are available in http://www.enhanceratlas.org/.
AB - Long-range regulation by distal enhancers is crucial for many biological processes. The existing methods for enhancer-target gene prediction often require many genomic features. This makes them difficult to be applied to many cell types, in which the relevant datasets are not always available. Here, we design a tool EAGLE, an enhancer and gene learning ensemble method for identification of Enhancer-Gene (EG) interactions. Unlike existing tools, EAGLE used only six features derived from the genomic features of enhancers and gene expression datasets. Cross-validation revealed that EAGLE outperformed other existing methods. Enrichment analyses on special transcriptional factors, epigenetic modifications, and eQTLs demonstrated that EAGLE could distinguish the interacting pairs from non-interacting ones. Finally, EAGLE was applied to mouse and human genomes and identified 7,680,203 and 7,437,255 EG interactions involving 31,375 and 43,724 genes, 138,547 and 177,062 enhancers across 89 and 110 tissue/cell types in mouse and human, respectively. The obtained interactions are accessible through an interactive database enhanceratlas.org. The EAGLE method is available at https://github.com/EvansGao/EAGLE and the predicted datasets are available in http://www.enhanceratlas.org/.
UR - http://www.scopus.com/inward/record.url?scp=85074326977&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074326977&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1007436
DO - 10.1371/journal.pcbi.1007436
M3 - Article
C2 - 31665135
AN - SCOPUS:85074326977
SN - 1553-734X
VL - 15
JO - PLoS computational biology
JF - PLoS computational biology
IS - 10
M1 - e1007436
ER -