Robust kernel canonical correlation analysis to detect gene-gene co-Associations: A case study in genetics

M. Ashad Alam, Osamu Komori, Hong Wen Deng, Vince D. Calhoun, Yu Ping Wang

Research output: Contribution to journal (Article)

Abstract

The kernel canonical correlation analysis based U-statistic (KCCU) is used to detect nonlinear gene-gene co-Associations. Estimating the variance of the KCCU is, however, computationally intensive. In addition, kernel canonical correlation analysis (kernel CCA) is not robust to contaminated data. Using a robust kernel mean element and a robust kernel (cross)-covariance operator potentially enables a robust kernel CCA, which is studied in this paper. We first propose an influence function-based estimator for the variance of the KCCU. We then present a non-parametric robust KCCU designed for contaminated data. The robust KCCU is less sensitive to noise than the KCCU. We investigate the proposed method using both synthesized data and real data from the Mind Clinical Imaging Consortium (MCIC). We show through simulation studies that the power of the proposed methods is a monotonically increasing function of sample size, and that the robust test statistics bring incremental gains in power. To demonstrate the advantage of the robust kernel CCA, we study MCIC data among 22,442 candidate schizophrenia genes for gene-gene co-Associations. We select 768 genes with strong evidence for shedding light on gene-gene interaction networks for schizophrenia. Gene ontology enrichment analysis, pathway analysis, gene-gene network analysis, and other studies show that the proposed robust methods can find undiscovered genes in addition to significant gene pairs, and that they outperform several current approaches.

Original language: English (US)
Article number: 1950028
Journal: Journal of Bioinformatics and Computational Biology
Volume: 17
Issue number: 4
DOI: 10.1142/S0219720019500288
State: Published - Aug 1 2019
Externally published: Yes

Fingerprint

Genes
Statistics
Gene Regulatory Networks
Medical imaging
Schizophrenia
Genetics
Gene Ontology
Sample Size
Noise
Ontology

Keywords

  • gene-gene co-Association
  • kernel methods
  • robust kernel canonical correlation analysis
  • Robustness

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications

Cite this

Robust kernel canonical correlation analysis to detect gene-gene co-Associations: A case study in genetics. / Ashad Alam, M.; Komori, Osamu; Deng, Hong Wen; Calhoun, Vince D.; Wang, Yu Ping.

In: Journal of Bioinformatics and Computational Biology, Vol. 17, No. 4, 1950028, 01.08.2019.

@article{aba3e0f176094a2ea3a9c567e7885c33,
title = "Robust kernel canonical correlation analysis to detect gene-gene co-Associations: A case study in genetics",
abstract = "The kernel canonical correlation analysis based U-statistic (KCCU) is being used to detect nonlinear gene-gene co-Associations. Estimating the variance of the KCCU is however computationally intensive. In addition, the kernel canonical correlation analysis (kernel CCA) is not robust to contaminated data. Using a robust kernel mean element and a robust kernel (cross)-covariance operator potentially enables the use of a robust kernel CCA, which is studied in this paper. We first propose an influence function-based estimator for the variance of the KCCU. We then present a non-parametric robust KCCU, which is designed for dealing with contaminated data. The robust KCCU is less sensitive to noise than KCCU. We investigate the proposed method using both synthesized and real data from the Mind Clinical Imaging Consortium (MCIC). We show through simulation studies that the power of the proposed methods is a monotonically increasing function of sample size, and the robust test statistics bring incremental gains in power. To demonstrate the advantage of the robust kernel CCA, we study MCIC data among 22,442 candidate Schizophrenia genes for gene-gene co-Associations. We select 768 genes with strong evidence for shedding light on gene-gene interaction networks for Schizophrenia. By performing gene ontology enrichment analysis, pathway analysis, gene-gene network and other studies, the proposed robust methods can find undiscovered genes in addition to significant gene pairs, and demonstrate superior performance over several of current approaches.",
keywords = "gene-gene co-Association, kernel methods, robust kernel canonical correlation analysis, Robustness",
author = "{Ashad Alam}, M. and Komori, Osamu and Deng, {Hong Wen} and Calhoun, {Vince D.} and Wang, {Yu Ping}",
year = "2019",
month = "8",
day = "1",
doi = "10.1142/S0219720019500288",
language = "English (US)",
volume = "17",
journal = "Journal of Bioinformatics and Computational Biology",
issn = "0219-7200",
publisher = "World Scientific Publishing Co. Pte Ltd",
number = "4",
}

TY - JOUR

T1 - Robust kernel canonical correlation analysis to detect gene-gene co-Associations

T2 - A case study in genetics

AU - Ashad Alam, M.

AU - Komori, Osamu

AU - Deng, Hong Wen

AU - Calhoun, Vince D.

AU - Wang, Yu Ping

PY - 2019/8/1

Y1 - 2019/8/1

N2 - The kernel canonical correlation analysis based U-statistic (KCCU) is being used to detect nonlinear gene-gene co-Associations. Estimating the variance of the KCCU is however computationally intensive. In addition, the kernel canonical correlation analysis (kernel CCA) is not robust to contaminated data. Using a robust kernel mean element and a robust kernel (cross)-covariance operator potentially enables the use of a robust kernel CCA, which is studied in this paper. We first propose an influence function-based estimator for the variance of the KCCU. We then present a non-parametric robust KCCU, which is designed for dealing with contaminated data. The robust KCCU is less sensitive to noise than KCCU. We investigate the proposed method using both synthesized and real data from the Mind Clinical Imaging Consortium (MCIC). We show through simulation studies that the power of the proposed methods is a monotonically increasing function of sample size, and the robust test statistics bring incremental gains in power. To demonstrate the advantage of the robust kernel CCA, we study MCIC data among 22,442 candidate Schizophrenia genes for gene-gene co-Associations. We select 768 genes with strong evidence for shedding light on gene-gene interaction networks for Schizophrenia. By performing gene ontology enrichment analysis, pathway analysis, gene-gene network and other studies, the proposed robust methods can find undiscovered genes in addition to significant gene pairs, and demonstrate superior performance over several of current approaches.

KW - gene-gene co-Association

KW - kernel methods

KW - robust kernel canonical correlation analysis

KW - Robustness

UR - http://www.scopus.com/inward/record.url?scp=85073420660&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85073420660&partnerID=8YFLogxK

U2 - 10.1142/S0219720019500288

DO - 10.1142/S0219720019500288

M3 - Article

C2 - 31617462

AN - SCOPUS:85073420660

VL - 17

JO - Journal of Bioinformatics and Computational Biology

JF - Journal of Bioinformatics and Computational Biology

SN - 0219-7200

IS - 4

M1 - 1950028

ER -