Sparse representation based biomarker selection for schizophrenia with integrated analysis of fMRI and SNPs

Hongbao Cao, Junbo Duan, Dongdong Lin, Yin Yao Shugart, Vince Daniel Calhoun, Yu Ping Wang

Research output: Contribution to journal › Article

Abstract

Integrative analysis of multiple data types can take advantage of their complementary information and may therefore provide higher power to identify potential biomarkers that would be missed when each data type is analyzed separately. Because data modalities differ in nature, integrating them is challenging. Here we address the data integration problem by developing a generalized sparse model (GSM) that uses weighting factors to integrate multi-modality data for biomarker selection. As an example, we applied the GSM to a joint analysis of two types of schizophrenia data: 759,075 SNPs and 153,594 functional magnetic resonance imaging (fMRI) voxels in 208 subjects (92 cases/116 controls). To solve this small-sample-large-variable problem, we developed a novel sparse representation based variable selection (SRVS) algorithm, with the primary aim of identifying biomarkers associated with schizophrenia. To validate the effectiveness of the selected variables, we performed multivariate classification followed by ten-fold cross-validation. We compared our proposed SRVS algorithm with an earlier sparse model based variable selection algorithm for integrated analysis. In addition, we compared it with traditional statistical methods for univariate data analysis (chi-squared test for SNP data and ANOVA for fMRI data). Results showed that our proposed SRVS method can identify novel biomarkers that show stronger capability in distinguishing schizophrenia patients from healthy controls. Moreover, better classification ratios were achieved using biomarkers from both types of data, suggesting the importance of integrative analysis.
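The idea of using weighting factors to combine two modalities in one sparse model can be illustrated with a minimal sketch. This is not the SRVS algorithm or the GSM from the paper, whose details are not given in the abstract; it is a generic L1-regularized least-squares fit solved by iterative soft thresholding (ISTA), under the assumption that each modality is column-normalized and scaled by a scalar weight before concatenation. The function names, the weights `w_snp` and `w_fmri`, and the penalty `lam` are all hypothetical.

```python
import numpy as np

def normalize_columns(X):
    """Center each column and scale it to unit L2 norm (constant columns become zero)."""
    Xc = X - X.mean(axis=0)
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for constant columns
    return Xc / norms

def ista_lasso(A, y, lam, n_iter=500):
    """Minimize 0.5*||y - A x||^2 + lam*||x||_1 by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))                         # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)   # soft threshold
    return x

def select_variables(X_snp, X_fmri, y, w_snp=0.5, w_fmri=0.5, lam=0.1):
    """Return indices of SNP and fMRI columns with nonzero sparse coefficients.

    Assumption (not from the paper): modalities are combined by scaling each
    normalized block with a scalar weight and concatenating the columns.
    """
    A = np.hstack([w_snp * normalize_columns(X_snp),
                   w_fmri * normalize_columns(X_fmri)])
    x = ista_lasso(A, y - y.mean(), lam)
    support = np.flatnonzero(x)
    n_snp = X_snp.shape[1]
    return support[support < n_snp], support[support >= n_snp] - n_snp
```

Here `y` would be a numeric phenotype vector such as 0/1 case-control labels, and the two returned index arrays refer to columns of `X_snp` and `X_fmri` respectively. Raising `lam` selects fewer variables, and changing the ratio of the two weights shifts the selection toward one modality.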

Original language: English (US)
Pages (from-to): 220-228
Number of pages: 9
Journal: NeuroImage
Volume: 102
Issue number: P1
DOI: 10.1016/j.neuroimage.2014.01.021
State: Published - Nov 5 2014
Externally published: Yes

Fingerprint

Single Nucleotide Polymorphism
Schizophrenia
Biomarkers
Magnetic Resonance Imaging
Analysis of Variance

Keywords

  • FMRI
  • Schizophrenia
  • SNP
  • Sparse representations
  • Variable selection

ASJC Scopus subject areas

  • Cognitive Neuroscience
  • Neurology

Cite this

Sparse representation based biomarker selection for schizophrenia with integrated analysis of fMRI and SNPs. / Cao, Hongbao; Duan, Junbo; Lin, Dongdong; Shugart, Yin Yao; Calhoun, Vince Daniel; Wang, Yu Ping.

In: NeuroImage, Vol. 102, No. P1, 05.11.2014, p. 220-228.

@article{3b39162298754473a2eeaf46abfe01bd,
title = "Sparse representation based biomarker selection for schizophrenia with integrated analysis of fMRI and SNPs",
keywords = "FMRI, Schizophrenia, SNP, Sparse representations, Variable selection",
author = "Hongbao Cao and Junbo Duan and Dongdong Lin and Shugart, {Yin Yao} and Calhoun, {Vince Daniel} and Wang, {Yu Ping}",
year = "2014",
month = "11",
day = "5",
doi = "10.1016/j.neuroimage.2014.01.021",
language = "English (US)",
volume = "102",
pages = "220--228",
journal = "NeuroImage",
issn = "1053-8119",
publisher = "Academic Press Inc.",
number = "P1",

}

TY - JOUR

T1 - Sparse representation based biomarker selection for schizophrenia with integrated analysis of fMRI and SNPs

AU - Cao, Hongbao

AU - Duan, Junbo

AU - Lin, Dongdong

AU - Shugart, Yin Yao

AU - Calhoun, Vince Daniel

AU - Wang, Yu Ping

PY - 2014/11/5

Y1 - 2014/11/5

KW - FMRI

KW - Schizophrenia

KW - SNP

KW - Sparse representations

KW - Variable selection

UR - http://www.scopus.com/inward/record.url?scp=84908367503&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84908367503&partnerID=8YFLogxK

U2 - 10.1016/j.neuroimage.2014.01.021

DO - 10.1016/j.neuroimage.2014.01.021

M3 - Article

C2 - 24530838

AN - SCOPUS:84908367503

VL - 102

SP - 220

EP - 228

JO - NeuroImage

JF - NeuroImage

SN - 1053-8119

IS - P1

ER -