Regulatory component analysis

A semi-blind extraction approach to infer gene regulatory networks with imperfect biological knowledge

Chen Wang, Jianhua Xuan, Ie Ming Shih, Robert Clarke, Yue Wang

Research output: Contribution to journal › Article

Abstract

With the advent of high-throughput biotechnology capable of monitoring genomic signals, it has become increasingly promising to understand molecular cellular mechanisms through systems biology approaches. One of the active research topics in systems biology is the inference of gene transcriptional regulatory networks from various genomic data; this inference problem can be formulated as a linear model with latent signals associated with regulatory proteins called transcription factors (TFs). As common statistical assumptions may not hold for genomic signals, typical latent variable algorithms such as independent component analysis (ICA) are incapable of revealing the true underlying regulatory signals. Liao et al. [1] proposed to perform inference using an approach named network component analysis (NCA), the optimization of which is achieved by least-squares fitting with biological knowledge constraints. However, the incompleteness of biological knowledge and its inconsistency with gene expression data are not considered in the original NCA solution, which could greatly affect the inference accuracy. To overcome these limitations, we propose a linear extraction scheme, namely regulatory component analysis (RCA), to infer underlying regulatory signals even with partial biological knowledge. Numerical simulations show a significant improvement of our proposed RCA over NCA, not only when the signal-to-noise ratio (SNR) is low but also when the given biological knowledge is incomplete and inconsistent with gene expression data. Furthermore, real biological experiments on Escherichia coli are performed for regulatory network inference in comparison with several typical linear latent variable methods, which again demonstrate the effectiveness and improved performance of the proposed algorithm.
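As a rough illustration of the linear model with knowledge constraints described in the abstract, the following Python sketch fits E ≈ A P by alternating least squares while forcing A to be zero wherever a connectivity mask records no known TF-gene link. This is an NCA-style toy under stated assumptions, not the RCA algorithm of the paper and not the authors' code. The function name `nca_style_fit`, the symbols `E` (genes × samples expression), `A` (genes × TFs connection strengths), `P` (TFs × samples activities), `mask`, and the iteration count are all hypothetical choices made for illustration.

```python
import numpy as np


def nca_style_fit(E, mask, n_iter=50, seed=0):
    """Alternating least squares for E ~ A @ P with A zero outside `mask`.

    E    : (n_genes, n_samples) expression matrix.
    mask : (n_genes, n_tfs) boolean array, True where a TF-gene link is
           assumed known (hypothetical prior knowledge).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_tfs = mask.shape
    # Random start that already respects the zero pattern.
    A = rng.standard_normal((n_genes, n_tfs)) * mask
    for _ in range(n_iter):
        # Step 1: update TF activities P with A held fixed.
        P, *_ = np.linalg.lstsq(A, E, rcond=None)
        # Step 2: update each gene's allowed connection strengths with P fixed.
        for i in range(n_genes):
            idx = np.flatnonzero(mask[i])
            if idx.size == 0:
                continue  # gene with no known regulator stays at zero
            coef, *_ = np.linalg.lstsq(P[idx].T, E[i], rcond=None)
            A[i, idx] = coef
    return A, P
```

Each half-step solves an ordinary least-squares problem, so the reconstruction error cannot increase from one iteration to the next. The sketch assumes the mask is complete and correct, which is precisely the assumption the abstract says can fail in practice.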

Original language: English (US)
Pages (from-to): 1902-1915
Number of pages: 14
Journal: Signal Processing
Volume: 92
Issue number: 8
DOI: 10.1016/j.sigpro.2011.11.028
State: Published - Aug 2012

Fingerprint

Network components
Genes
Gene expression
Transcription factors
Independent component analysis
Biotechnology
Escherichia coli
Signal to noise ratio
Throughput
Proteins
Monitoring
Computer simulation
Experiments
Systems Biology

Keywords

  • Gene expression
  • Genomic signal processing
  • Source extraction
  • Transcriptional regulatory network inference

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Control and Systems Engineering
  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition

Cite this

Regulatory component analysis: A semi-blind extraction approach to infer gene regulatory networks with imperfect biological knowledge. / Wang, Chen; Xuan, Jianhua; Shih, Ie Ming; Clarke, Robert; Wang, Yue.

In: Signal Processing, Vol. 92, No. 8, 08.2012, p. 1902-1915.

@article{ae8e02d42cc34bc3a998eb1f91201d4c,
title = "Regulatory component analysis: A semi-blind extraction approach to infer gene regulatory networks with imperfect biological knowledge",
abstract = "With the advent of high-throughput biotechnology capable of monitoring genomic signals, it becomes increasingly promising to understand molecular cellular mechanisms through systems biology approaches. One of the active research topics in systems biology is to infer gene transcriptional regulatory networks using various genomic data; this inference problem can be formulated as a linear model with latent signals associated with some regulatory proteins called transcription factors (TFs). As common statistical assumptions may not hold for genomic signals, typical latent variable algorithms such as independent component analysis (ICA) are incapable to reveal underlying true regulatory signals. Liao et al. [1] proposed to perform inference using an approach named network component analysis (NCA), the optimization of which is achieved by a least-squares fitting approach with biological knowledge constraints. However, the incompleteness of biological knowledge and its inconsistency with gene expression data are not considered in the original NCA solution, which could greatly affect the inference accuracy. To overcome these limitations, we propose a linear extraction scheme, namely regulatory component analysis (RCA), to infer underlying regulatory signals even with partial biological knowledge. Numerical simulations show a significant improvement of our proposed RCA over NCA, not only when signal-to-noise ratio (SNR) is low but also when the given biological knowledge is incomplete and inconsistent to gene expression data. Furthermore, real biological experiments on Escherichia coli are performed for regulatory network inference in comparison with several typical linear latent variable methods, which again demonstrates the effectiveness and improved performance of the proposed algorithm.",
keywords = "Gene expression, Genomic signal processing, Source extraction, Transcriptional regulatory network inference",
author = "Chen Wang and Jianhua Xuan and Shih, {Ie Ming} and Robert Clarke and Yue Wang",
year = "2012",
month = "8",
doi = "10.1016/j.sigpro.2011.11.028",
language = "English (US)",
volume = "92",
pages = "1902--1915",
journal = "Signal Processing",
issn = "0165-1684",
publisher = "Elsevier",
number = "8",
}

TY - JOUR

T1 - Regulatory component analysis

T2 - A semi-blind extraction approach to infer gene regulatory networks with imperfect biological knowledge

AU - Wang, Chen

AU - Xuan, Jianhua

AU - Shih, Ie Ming

AU - Clarke, Robert

AU - Wang, Yue

PY - 2012/8

Y1 - 2012/8

AB - With the advent of high-throughput biotechnology capable of monitoring genomic signals, it becomes increasingly promising to understand molecular cellular mechanisms through systems biology approaches. One of the active research topics in systems biology is to infer gene transcriptional regulatory networks using various genomic data; this inference problem can be formulated as a linear model with latent signals associated with some regulatory proteins called transcription factors (TFs). As common statistical assumptions may not hold for genomic signals, typical latent variable algorithms such as independent component analysis (ICA) are incapable to reveal underlying true regulatory signals. Liao et al. [1] proposed to perform inference using an approach named network component analysis (NCA), the optimization of which is achieved by a least-squares fitting approach with biological knowledge constraints. However, the incompleteness of biological knowledge and its inconsistency with gene expression data are not considered in the original NCA solution, which could greatly affect the inference accuracy. To overcome these limitations, we propose a linear extraction scheme, namely regulatory component analysis (RCA), to infer underlying regulatory signals even with partial biological knowledge. Numerical simulations show a significant improvement of our proposed RCA over NCA, not only when signal-to-noise ratio (SNR) is low but also when the given biological knowledge is incomplete and inconsistent to gene expression data. Furthermore, real biological experiments on Escherichia coli are performed for regulatory network inference in comparison with several typical linear latent variable methods, which again demonstrates the effectiveness and improved performance of the proposed algorithm.

KW - Gene expression

KW - Genomic signal processing

KW - Source extraction

KW - Transcriptional regulatory network inference

UR - http://www.scopus.com/inward/record.url?scp=84858070004&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84858070004&partnerID=8YFLogxK

U2 - 10.1016/j.sigpro.2011.11.028

DO - 10.1016/j.sigpro.2011.11.028

M3 - Article

VL - 92

SP - 1902

EP - 1915

JO - Signal Processing

JF - Signal Processing

SN - 0165-1684

IS - 8

ER -