Sparse deep neural networks on imaging genetics for schizophrenia case-control classification

Jiayu Chen; Xiang Li; Vince D. Calhoun; Jessica A. Turner; Theo G.M. van Erp; Lei Wang; Ole A. Andreassen; Ingrid Agartz; Lars T. Westlye; Erik Jönsson; Judith M. Ford; Daniel H. Mathalon; Fabio Macciardi; Daniel S. O’Leary; Jingyu Liu; Shihao Ji

doi:10.1101/2020.06.11.20128975

School of Medicine

Research output: Contribution to journal › Article › peer-review

Abstract

Machine learning approaches hold potential for deconstructing complex psychiatric traits and yielding biomarkers with strong potential for clinical application. In particular, advances in deep learning have made these methods highly promising for this purpose, owing to their capability to handle high-dimensional data and automatically extract high-level latent features. However, currently proposed approaches for psychiatric classification or prediction using biological data do not allow direct interpretation of the original features, which hinders insight into the biological underpinnings and the development of biomarkers. In the present study, we introduce a sparse deep neural network (DNN) approach to identify sparse and interpretable features for schizophrenia (SZ) case-control classification. L0-norm regularization is implemented on the input layer of the network for sparse feature selection, and the selected features can later be interpreted based on importance weights. We applied the proposed approach to a large multi-study cohort (N = 1,684) with brain structural MRI (gray matter volume (GMV)) and genetic (single nucleotide polymorphism (SNP)) data to discriminate patients with SZ from controls. A total of 634 individuals served as training samples, and the resulting classification model was evaluated for generalizability on three independent data sets collected at different sites with different scanning protocols (n = 635, 255 and 160, respectively). We examined the classification power of GMV features alone, as well as combined GMV and SNP features. The performance of the proposed approach was compared with that of an independent component analysis + support vector machine (ICA+SVM) framework. Empirical experiments demonstrated that sparse DNN slightly outperformed ICA+SVM and more effectively fused GMV and SNP features for SZ discrimination. With combined GMV and SNP features, sparse DNN yielded an average classification error rate of 28.98% on external data. The importance weights suggested that the DNN model prioritized frontal regions and the superior temporal gyrus for SZ classification when high sparsity was enforced, and that parietal regions were additionally included at a lower sparsity setting, which strongly echoes previous literature. This is the first attempt to apply an interpretable sparse DNN model to imaging and genetic features for SZ classification with generalizability assessed in a large, multi-study cohort. The results validate the application of the proposed approach to SZ classification and promise extended utility for other data modalities (e.g. functional and diffusion images) and traits (e.g. continuous scores), which may ultimately result in clinically useful tools.

Original language	English (US)
Journal	Unknown Journal
DOIs	https://doi.org/10.1101/2020.06.11.20128975
State	Published - Jun 12 2020

ASJC Scopus subject areas

General Medicine

Cite this

Chen, J., Li, X., Calhoun, V. D., Turner, J. A., van Erp, T. G. M., Wang, L., Andreassen, O. A., Agartz, I., Westlye, L. T., Jönsson, E., Ford, J. M., Mathalon, D. H., Macciardi, F., O’Leary, D. S., Liu, J., & Ji, S. (2020). Sparse deep neural networks on imaging genetics for schizophrenia case-control classification. Unknown Journal. https://doi.org/10.1101/2020.06.11.20128975

@article{5de297252e1742a8a958822d1a6f39f6,

title = "Sparse deep neural networks on imaging genetics for schizophrenia case-control classification",

author = "Jiayu Chen and Xiang Li and Calhoun, {Vince D.} and Turner, {Jessica A.} and {van Erp}, {Theo G.M.} and Lei Wang and Andreassen, {Ole A.} and Ingrid Agartz and Westlye, {Lars T.} and Erik J{\"o}nsson and Ford, {Judith M.} and Mathalon, {Daniel H.} and Fabio Macciardi and O{\textquoteright}Leary, {Daniel S.} and Jingyu Liu and Shihao Ji",

note = "Publisher Copyright: The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. Copyright: Copyright 2021 Elsevier B.V., All rights reserved.",

year = "2020",

month = jun,

day = "12",

doi = "10.1101/2020.06.11.20128975",

language = "English (US)",

journal = "Unknown Journal",

issn = "0309-1708",

publisher = "Elsevier Limited",

}

TY - JOUR

T1 - Sparse deep neural networks on imaging genetics for schizophrenia case-control classification

AU - Chen, Jiayu

AU - Li, Xiang

AU - Calhoun, Vince D.

AU - Turner, Jessica A.

AU - van Erp, Theo G.M.

AU - Wang, Lei

AU - Andreassen, Ole A.

AU - Agartz, Ingrid

AU - Westlye, Lars T.

AU - Jönsson, Erik

AU - Ford, Judith M.

AU - Mathalon, Daniel H.

AU - Macciardi, Fabio

AU - O’Leary, Daniel S.

AU - Liu, Jingyu

AU - Ji, Shihao

N1 - Publisher Copyright: The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. Copyright: Copyright 2021 Elsevier B.V., All rights reserved.

PY - 2020/6/12

Y1 - 2020/6/12

UR - http://www.scopus.com/inward/record.url?scp=85099546290&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85099546290&partnerID=8YFLogxK

U2 - 10.1101/2020.06.11.20128975

DO - 10.1101/2020.06.11.20128975

M3 - Article

AN - SCOPUS:85099546290

SN - 0309-1708

JO - Unknown Journal

JF - Unknown Journal

ER -
