Adaptive sparse multiple canonical correlation analysis with application to imaging (epi)genomics study of schizophrenia

Wenxing Hu, Dongdong Lin, Shaolong Cao, Jing Yu Liu, Jiayu Chen, Vince Daniel Calhoun, Yuping Wang

Research output: Contribution to journal › Article

Abstract

Finding correlations across multiple data sets in imaging and (epi)genomics is a common challenge. Sparse multiple canonical correlation analysis (SMCCA) is a multivariate model widely used to extract contributing features from each data set while maximizing the cross-modality correlation. The model is formulated by combining the pairwise covariances between every two data sets. However, the scales of the pairwise covariances can differ considerably, so combining them directly in SMCCA is unfair. This problem of “unfair combination of pairwise covariances” restricts the power of SMCCA for feature selection. In this paper, we propose a novel formulation of SMCCA, called adaptive SMCCA, that overcomes the problem by introducing adaptive weights when combining pairwise covariances. Both simulation and real data analysis show that adaptive SMCCA outperforms conventional SMCCA and SMCCA with fixed weights in feature selection. Large-scale numerical experiments show that adaptive SMCCA converges as fast as conventional SMCCA. When applied to an imaging (epi)genetics study of schizophrenia subjects, it detects significant (epi)genetic variants and brain regions that are consistent with existing reports. In addition, our model detects several significant brain-development related pathways, e.g., neural tube development, demonstrating that imaging-epigenetic associations may be overlooked by conventional SMCCA. All these results demonstrate that adaptive SMCCA is well suited for detecting three-way or multi-way correlations and thus can find widespread application in the integration of multiple omics and imaging data.
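
To make the scale problem described in the abstract concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It evaluates the pairwise covariances between projected data sets for fixed projection vectors and compares a plain sum with a weighted sum. The function names, the synthetic data, and in particular the inverse-magnitude weighting rule are illustrative assumptions; the abstract does not state how the adaptive weights are computed.

```python
import numpy as np


def pairwise_covariances(datasets, projections):
    """Sample covariance of the projected data for every pair i < j.

    datasets: list of column-centered arrays, each of shape (n, p_i).
    projections: list of vectors u_i, each of shape (p_i,).
    """
    n = datasets[0].shape[0]
    covs = {}
    for i in range(len(datasets)):
        for j in range(i + 1, len(datasets)):
            zi = datasets[i] @ projections[i]
            zj = datasets[j] @ projections[j]
            covs[(i, j)] = float(zi @ zj) / (n - 1)
    return covs


def combined_objective(covs, pair_weights=None):
    """Weighted sum of pairwise covariances; unit weights give the plain sum."""
    if pair_weights is None:
        pair_weights = {pair: 1.0 for pair in covs}
    return sum(pair_weights[pair] * value for pair, value in covs.items())


def inverse_magnitude_weights(covs, eps=1e-12):
    """Hypothetical weighting rule (an assumption, not taken from the paper).

    Each weight is proportional to 1 / |covariance| and the weights are
    rescaled to sum to the number of pairs, so every pair contributes a
    term of roughly equal magnitude.
    """
    raw = {pair: 1.0 / (abs(value) + eps) for pair, value in covs.items()}
    scale = len(raw) / sum(raw.values())
    return {pair: scale * w for pair, w in raw.items()}


# Synthetic example: three data sets share one latent signal, but the first
# two are measured on a scale ten times larger than the third.
rng = np.random.default_rng(0)
n, p = 200, 5
latent = rng.standard_normal(n)
scales = [10.0, 10.0, 1.0]
demo_datasets = []
for s in scales:
    x = s * (latent[:, None] + 0.5 * rng.standard_normal((n, p)))
    demo_datasets.append(x - x.mean(axis=0))  # center each column
demo_projections = [np.ones(p) / np.sqrt(p) for _ in scales]

demo_covs = pairwise_covariances(demo_datasets, demo_projections)
demo_weights = inverse_magnitude_weights(demo_covs)

print("pairwise covariances:", demo_covs)
print("plain sum:", combined_objective(demo_covs))
print("weighted terms:", {k: demo_weights[k] * v for k, v in demo_covs.items()})
```

In this toy setting the pair formed by the two large-scale data sets accounts for most of the plain sum, which is the imbalance the abstract calls unfair. After reweighting, the three terms have comparable magnitude. A full SMCCA solver would also optimize the projection vectors under sparsity constraints, which this sketch leaves out.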

Original language: English (US)
Journal: IEEE Transactions on Biomedical Engineering
DOI: 10.1109/TBME.2017.2771483
State: Accepted/In press - Nov 7 2017
Externally published: Yes

Fingerprint

Imaging techniques
Feature extraction
Brain
Data integration
Genomics
Experiments
Genetics

Keywords

  • Adaptation models
  • Biomarkers
  • canonical correlation analysis
  • Correlation
  • data fusion
  • Diseases
  • Feature extraction
  • genomic data analysis
  • Imaging
  • imaging genomics
  • Linear programming
  • multi-omics data integration

ASJC Scopus subject areas

  • Biomedical Engineering

Cite this

Adaptive sparse multiple canonical correlation analysis with application to imaging (epi)genomics study of schizophrenia. / Hu, Wenxing; Lin, Dongdong; Cao, Shaolong; Liu, Jing Yu; Chen, Jiayu; Calhoun, Vince Daniel; Wang, Yuping.

In: IEEE Transactions on Biomedical Engineering, 07.11.2017.

@article{c20c8826dc254449aae689dbaf712280,
title = "Adaptive sparse multiple canonical correlation analysis with application to imaging (epi)genomics study of schizophrenia",
abstract = "Finding correlations across multiple data sets in imaging and (epi)genomics is a common challenge. Sparse multiple canonical correlation analysis (SMCCA) is a multivariate model widely used to extract contributing features from each data set while maximizing the cross-modality correlation. The model is formulated by combining the pairwise covariances between every two data sets. However, the scales of the pairwise covariances can differ considerably, so combining them directly in SMCCA is unfair. This problem of ``unfair combination of pairwise covariances'' restricts the power of SMCCA for feature selection. In this paper, we propose a novel formulation of SMCCA, called adaptive SMCCA, that overcomes the problem by introducing adaptive weights when combining pairwise covariances. Both simulation and real data analysis show that adaptive SMCCA outperforms conventional SMCCA and SMCCA with fixed weights in feature selection. Large-scale numerical experiments show that adaptive SMCCA converges as fast as conventional SMCCA. When applied to an imaging (epi)genetics study of schizophrenia subjects, it detects significant (epi)genetic variants and brain regions that are consistent with existing reports. In addition, our model detects several significant brain-development related pathways, e.g., neural tube development, demonstrating that imaging-epigenetic associations may be overlooked by conventional SMCCA. All these results demonstrate that adaptive SMCCA is well suited for detecting three-way or multi-way correlations and thus can find widespread application in the integration of multiple omics and imaging data.",
keywords = "Adaptation models, Biomarkers, canonical correlation analysis, Correlation, data fusion, Diseases, Feature extraction, genomic data analysis, Imaging, imaging genomics, Linear programming, multi-omics data integration",
author = "Wenxing Hu and Dongdong Lin and Shaolong Cao and Liu, {Jing Yu} and Jiayu Chen and Calhoun, {Vince Daniel} and Yuping Wang",
year = "2017",
month = "11",
day = "7",
doi = "10.1109/TBME.2017.2771483",
language = "English (US)",
journal = "IEEE Transactions on Biomedical Engineering",
issn = "0018-9294",
publisher = "IEEE Computer Society",
}

TY - JOUR

T1 - Adaptive sparse multiple canonical correlation analysis with application to imaging (epi)genomics study of schizophrenia

AU - Hu, Wenxing

AU - Lin, Dongdong

AU - Cao, Shaolong

AU - Liu, Jing Yu

AU - Chen, Jiayu

AU - Calhoun, Vince Daniel

AU - Wang, Yuping

PY - 2017/11/7

Y1 - 2017/11/7

AB - Finding correlations across multiple data sets in imaging and (epi)genomics is a common challenge. Sparse multiple canonical correlation analysis (SMCCA) is a multivariate model widely used to extract contributing features from each data set while maximizing the cross-modality correlation. The model is formulated by combining the pairwise covariances between every two data sets. However, the scales of the pairwise covariances can differ considerably, so combining them directly in SMCCA is unfair. This problem of “unfair combination of pairwise covariances” restricts the power of SMCCA for feature selection. In this paper, we propose a novel formulation of SMCCA, called adaptive SMCCA, that overcomes the problem by introducing adaptive weights when combining pairwise covariances. Both simulation and real data analysis show that adaptive SMCCA outperforms conventional SMCCA and SMCCA with fixed weights in feature selection. Large-scale numerical experiments show that adaptive SMCCA converges as fast as conventional SMCCA. When applied to an imaging (epi)genetics study of schizophrenia subjects, it detects significant (epi)genetic variants and brain regions that are consistent with existing reports. In addition, our model detects several significant brain-development related pathways, e.g., neural tube development, demonstrating that imaging-epigenetic associations may be overlooked by conventional SMCCA. All these results demonstrate that adaptive SMCCA is well suited for detecting three-way or multi-way correlations and thus can find widespread application in the integration of multiple omics and imaging data.

KW - Adaptation models

KW - Biomarkers

KW - canonical correlation analysis

KW - Correlation

KW - data fusion

KW - Diseases

KW - Feature extraction

KW - genomic data analysis

KW - Imaging

KW - imaging genomics

KW - Linear programming

KW - multi-omics data integration

UR - http://www.scopus.com/inward/record.url?scp=85033722402&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85033722402&partnerID=8YFLogxK

U2 - 10.1109/TBME.2017.2771483

DO - 10.1109/TBME.2017.2771483

M3 - Article

C2 - 29364120

AN - SCOPUS:85033722402

JO - IEEE Transactions on Biomedical Engineering

JF - IEEE Transactions on Biomedical Engineering

SN - 0018-9294

ER -