Optimized Combination of Multiple Graphs with Application to the Integration of Brain Imaging and (epi)Genomics Data

Yuntong Bai; Zille Pascal; Vince Calhoun; Yu Ping Wang

doi:10.1109/TMI.2019.2958256

School of Medicine

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

With the rapid development of high-throughput technologies, a growing volume of multi-omics data is being collected, creating strong demand for methods that combine such data for biomedical discovery. Because manual labelling is costly and time-consuming, the number of labelled samples is limited, which motivates the use of semi-supervised learning algorithms. In this work, we applied a graph-based semi-supervised learning (GSSL) method to classify a severe chronic mental disorder, schizophrenia (SZ). An advantage of GSSL is that it can simultaneously analyse more than two types of data, whereas many existing models focus on pairwise data analysis. In particular, we applied GSSL to single nucleotide polymorphism (SNP), functional magnetic resonance imaging (fMRI) and DNA methylation data, which account for genetics, brain imaging (endophenotypes), and environmental factors (epigenomics), respectively. Parameter selection remains an open challenge for most models, so another key contribution of this work is that we explored the parameter space to interpret the meaning of each hyper-parameter and established practical guidelines. Based on the practical significance of each hyper-parameter, a relatively small range of candidate values can be determined in a data-driven way, which both improves and speeds up parameter tuning. We validated the model on both synthetic data and a real SZ dataset of 184 subjects from the Mental Illness and Neuroscience Discovery (MIND) Clinical Imaging Consortium. Compared with several existing approaches, our algorithm achieved higher classification accuracy. We also confirmed the significance of several brain regions associated with SZ.
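The general idea of graph-based semi-supervised learning over several data types can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes a Gaussian-kernel similarity graph per data type, fixed user-supplied combination weights (the paper instead optimizes the combination), and the standard closed-form label-spreading solution. The function names `rbf_affinity`, `normalized_affinity`, and `propagate_labels`, and the parameters `alpha` and `sigma`, are hypothetical names chosen for illustration.

```python
import numpy as np

def rbf_affinity(X, sigma=1.0):
    """Dense Gaussian-kernel affinity matrix with a zero diagonal."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def normalized_affinity(W):
    """Symmetric normalization D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    return W * inv_sqrt[:, None] * inv_sqrt[None, :]

def propagate_labels(views, y, weights, alpha=0.9, sigma=1.0):
    """Spread labels over a weighted combination of per-view graphs.

    views   : list of (n_samples, n_features_v) arrays, one per data type
    y       : length-n vector, +1 / -1 for labelled samples, 0 for unlabelled
    weights : non-negative combination weights, one per view (assumed fixed here)
    Returns real-valued scores; the sign is the predicted class.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    # Convex combination of the normalized per-view affinity matrices.
    S = sum(w * normalized_affinity(rbf_affinity(X, sigma))
            for w, X in zip(weights, views))
    n = len(y)
    # Closed-form label spreading: solve (I - alpha * S) F = y, with 0 < alpha < 1.
    return np.linalg.solve(np.eye(n) - alpha * S, np.asarray(y, dtype=float))
```

In this sketch each entry of `views` would hold one data type (for example SNP, fMRI, or methylation features) for the same subjects in the same row order, and unlabelled subjects receive a score whose sign gives the predicted class.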

Original language	English (US)
Article number	8926394
Pages (from-to)	1801-1811
Number of pages	11
Journal	IEEE Transactions on Medical Imaging
Volume	39
Issue number	6
DOIs	https://doi.org/10.1109/TMI.2019.2958256
State	Published - Jun 2020

Keywords

Multi-view learning
graph-based analysis
parameter selection
schizophrenia

ASJC Scopus subject areas

Software
Radiological and Ultrasound Technology
Computer Science Applications
Electrical and Electronic Engineering

Cite this

@article{85cca558a42b40bf989eedda3b4a378f,

title = "Optimized Combination of Multiple Graphs with Application to the Integration of Brain Imaging and (epi)Genomics Data",

abstract = "With the rapid development of high-throughput technologies, a growing amount of multi-omics data are collected, giving rise to a great demand for combining such data for biomedical discovery. Due to the cost and time to label the data manually, the number of labelled samples is limited. This motivated the need for semi-supervised learning algorithms. In this work, we applied a graph-based semi-supervised learning (GSSL) to classify a severe chronic mental disorder, schizophrenia (SZ). An advantage of GSSL is that it can simultaneously analyse more than two types of data, while many existing models focus on pairwise data analysis. In particular, we applied GSSL to the analysis of single nucleotide polymorphism (SNP), functional magnetic resonance imaging (fMRI) and DNA methylation data, which accounts for genetics, brain imaging (endophenotypes), and environmental factors (epigenomics) respectively. While parameter selection has been an open challenge for most models, another key contribution of this work is that we explored the parameter space to interpret their meaning and established practical guidelines. Based on the practical significance of each hyper-parameter, a relatively small range of candidate values can be determined in a data-driven way to both optimize and speed up the parameter tuning process. We validated the model through both synthetic data and a real SZ dataset of 184 subjects from the Mental Illness and Neuroscience Discovery (MIND) Clinical Imaging Consortium. In comparison to several existing approaches, our algorithm achieved better performance in terms of classification accuracy. We also confirmed the significance of several brain regions associated with SZ.",

keywords = "Multi-view learning, graph-based analysis, parameter selection, schizophrenia",

author = "Bai, Yuntong and Pascal, Zille and Calhoun, Vince and Wang, {Yu Ping}",

note = "Funding Information: Manuscript received September 5, 2019; revised November 26, 2019; accepted December 3, 2019. Date of publication December 6, 2019; date of current version June 1, 2020. This work was supported in part by NIH under Grant P20GM103472, Grant R01EB005846, Grant R01GM109068, Grant R01MH104680, Grant R01MH107354, and Grant R01MH094524 and in part by NSF under Grant #1539067. (Corresponding author: Yuntong Bai.) Y. Bai, Z. Pascal, and Y.-P. Wang are with the Biomedical Engineering Department, Tulane University, New Orleans, LA 70118 USA (e-mail: wyp@tulane.edu; ybai1@tulane.edu). Publisher Copyright: {\textcopyright} 1982-2012 IEEE.",

year = "2020",

month = jun,

doi = "10.1109/TMI.2019.2958256",

language = "English (US)",

volume = "39",

pages = "1801--1811",

journal = "IEEE Transactions on Medical Imaging",

issn = "0278-0062",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "6",

}

TY - JOUR

T1 - Optimized Combination of Multiple Graphs with Application to the Integration of Brain Imaging and (epi)Genomics Data

AU - Bai, Yuntong

AU - Pascal, Zille

AU - Calhoun, Vince

AU - Wang, Yu Ping

N1 - Funding Information: Manuscript received September 5, 2019; revised November 26, 2019; accepted December 3, 2019. Date of publication December 6, 2019; date of current version June 1, 2020. This work was supported in part by NIH under Grant P20GM103472, Grant R01EB005846, Grant R01GM109068, Grant R01MH104680, Grant R01MH107354, and Grant R01MH094524 and in part by NSF under Grant #1539067. (Corresponding author: Yuntong Bai.) Y. Bai, Z. Pascal, and Y.-P. Wang are with the Biomedical Engineering Department, Tulane University, New Orleans, LA 70118 USA (e-mail: wyp@tulane.edu; ybai1@tulane.edu). Publisher Copyright: © 1982-2012 IEEE.

PY - 2020/6

Y1 - 2020/6

N2 - With the rapid development of high-throughput technologies, a growing amount of multi-omics data are collected, giving rise to a great demand for combining such data for biomedical discovery. Due to the cost and time to label the data manually, the number of labelled samples is limited. This motivated the need for semi-supervised learning algorithms. In this work, we applied a graph-based semi-supervised learning (GSSL) to classify a severe chronic mental disorder, schizophrenia (SZ). An advantage of GSSL is that it can simultaneously analyse more than two types of data, while many existing models focus on pairwise data analysis. In particular, we applied GSSL to the analysis of single nucleotide polymorphism (SNP), functional magnetic resonance imaging (fMRI) and DNA methylation data, which accounts for genetics, brain imaging (endophenotypes), and environmental factors (epigenomics) respectively. While parameter selection has been an open challenge for most models, another key contribution of this work is that we explored the parameter space to interpret their meaning and established practical guidelines. Based on the practical significance of each hyper-parameter, a relatively small range of candidate values can be determined in a data-driven way to both optimize and speed up the parameter tuning process. We validated the model through both synthetic data and a real SZ dataset of 184 subjects from the Mental Illness and Neuroscience Discovery (MIND) Clinical Imaging Consortium. In comparison to several existing approaches, our algorithm achieved better performance in terms of classification accuracy. We also confirmed the significance of several brain regions associated with SZ.

AB - With the rapid development of high-throughput technologies, a growing amount of multi-omics data are collected, giving rise to a great demand for combining such data for biomedical discovery. Due to the cost and time to label the data manually, the number of labelled samples is limited. This motivated the need for semi-supervised learning algorithms. In this work, we applied a graph-based semi-supervised learning (GSSL) to classify a severe chronic mental disorder, schizophrenia (SZ). An advantage of GSSL is that it can simultaneously analyse more than two types of data, while many existing models focus on pairwise data analysis. In particular, we applied GSSL to the analysis of single nucleotide polymorphism (SNP), functional magnetic resonance imaging (fMRI) and DNA methylation data, which accounts for genetics, brain imaging (endophenotypes), and environmental factors (epigenomics) respectively. While parameter selection has been an open challenge for most models, another key contribution of this work is that we explored the parameter space to interpret their meaning and established practical guidelines. Based on the practical significance of each hyper-parameter, a relatively small range of candidate values can be determined in a data-driven way to both optimize and speed up the parameter tuning process. We validated the model through both synthetic data and a real SZ dataset of 184 subjects from the Mental Illness and Neuroscience Discovery (MIND) Clinical Imaging Consortium. In comparison to several existing approaches, our algorithm achieved better performance in terms of classification accuracy. We also confirmed the significance of several brain regions associated with SZ.

KW - Multi-view learning

KW - graph-based analysis

KW - parameter selection

KW - schizophrenia

UR - http://www.scopus.com/inward/record.url?scp=85085905188&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85085905188&partnerID=8YFLogxK

U2 - 10.1109/TMI.2019.2958256

DO - 10.1109/TMI.2019.2958256

M3 - Article

C2 - 31825864

AN - SCOPUS:85085905188

SN - 0278-0062

VL - 39

SP - 1801

EP - 1811

JO - IEEE Transactions on Medical Imaging

JF - IEEE Transactions on Medical Imaging

IS - 6

M1 - 8926394

ER -
