TY - JOUR
T1 - Robust kernel canonical correlation analysis to detect gene-gene co-associations
T2 - A case study in genetics
AU - Ashad Alam, M.
AU - Komori, Osamu
AU - Deng, Hong Wen
AU - Calhoun, Vince D.
AU - Wang, Yu Ping
N1 - Funding Information:
The authors wish to thank the National Institutes of Health (R01GM109068, R01MH104680, R01MH107354, R01AR057049, P20 GM109036, U19AG055373, and R01AR069055), and National Science Foundation (1539067) for their support.
Publisher Copyright:
© 2019 World Scientific Publishing Europe Ltd.
PY - 2019/8/1
Y1 - 2019/8/1
AB - The kernel canonical correlation analysis based U-statistic (KCCU) is used to detect nonlinear gene-gene co-associations. Estimating the variance of the KCCU is, however, computationally intensive. In addition, kernel canonical correlation analysis (kernel CCA) is not robust to contaminated data. Using a robust kernel mean element and a robust kernel (cross)-covariance operator potentially enables a robust kernel CCA, which is studied in this paper. We first propose an influence function-based estimator for the variance of the KCCU. We then present a non-parametric robust KCCU, which is designed for dealing with contaminated data. The robust KCCU is less sensitive to noise than the KCCU. We investigate the proposed method using both synthesized data and real data from the Mind Clinical Imaging Consortium (MCIC). We show through simulation studies that the power of the proposed methods is a monotonically increasing function of sample size, and that the robust test statistics bring incremental gains in power. To demonstrate the advantage of the robust kernel CCA, we study MCIC data among 22,442 candidate schizophrenia genes for gene-gene co-associations. We select 768 genes with strong evidence for shedding light on gene-gene interaction networks for schizophrenia. Through gene ontology enrichment analysis, pathway analysis, gene-gene network analysis, and other studies, we show that the proposed robust methods can find undiscovered genes in addition to significant gene pairs, and that they demonstrate superior performance over several current approaches.
KW - Robustness
KW - gene-gene co-association
KW - kernel methods
KW - robust kernel canonical correlation analysis
UR - http://www.scopus.com/inward/record.url?scp=85073420660&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073420660&partnerID=8YFLogxK
U2 - 10.1142/S0219720019500288
DO - 10.1142/S0219720019500288
M3 - Article
C2 - 31617462
AN - SCOPUS:85073420660
VL - 17
JO - Journal of Bioinformatics and Computational Biology
JF - Journal of Bioinformatics and Computational Biology
SN - 0219-7200
IS - 4
M1 - 1950028
ER -