Decentralized distribution-sampled classification models with application to brain imaging

Noah Lewis, Harshvardhan Gazula, Sergey M. Plis, Vince D. Calhoun

Research output: Contribution to journal > Article

Abstract

Background: In this age of big data, certain models require very large data stores in order to be informative and accurate. In many cases, however, the data are stored in separate locations, and transferring them between local sites raises practical hurdles such as privacy concerns and heavy network load. This is especially true for medical imaging data, which can be constrained by the Health Insurance Portability and Accountability Act (HIPAA), which sets security protocols for medical data. Medical imaging datasets can also contain many thousands or millions of features, which adds to the network load. New method: Our research expands upon current decentralized classification research by implementing a new singleshot method for both neural networks and support vector machines. Our approach is to estimate the statistical distribution of the data at each local site and pass this information to the other local sites, where each site resamples from the individual distributions and trains a model on both the locally available data and the resampled data. The model at each local site produces its own accuracy value, and these values are then averaged to produce the global average accuracy. Results: We show applications of our approach to handwritten digit classification as well as to multi-subject classification of brain imaging data collected from patients with schizophrenia and healthy controls. Overall, the results showed classification accuracy comparable to that of the centralized model, with lower network load than multishot methods. Comparison with existing methods: Many decentralized classifiers are multishot, requiring heavy network traffic. Our model attempts to alleviate this load while preserving prediction accuracy. Conclusions: We show that our proposed approach performs comparably to a centralized approach while reducing network traffic relative to multishot methods.
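
The singleshot procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: each site summarizes its data as independent per-class, per-feature Gaussians (the abstract does not name the distribution family), a nearest-centroid classifier stands in for the neural network or support vector machine, and the toy two-class data and all function names (summarize, resample, single_shot, make_site) are hypothetical.

```python
import random
import statistics


def summarize(X, y):
    """Per-class, per-feature mean and std of one site's data (assumed Gaussian summary)."""
    stats = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cols = list(zip(*rows))
        stats[label] = {
            "n": len(rows),
            "mean": [statistics.fmean(c) for c in cols],
            "std": [statistics.pstdev(c) for c in cols],
        }
    return stats


def resample(stats, rng):
    """Draw as many synthetic samples per class as the remote site reported."""
    X, y = [], []
    for label, s in stats.items():
        for _ in range(s["n"]):
            X.append([rng.gauss(m, sd) for m, sd in zip(s["mean"], s["std"])])
            y.append(label)
    return X, y


def fit_centroids(X, y):
    """Stand-in classifier (assumption): one centroid per class."""
    return {label: s["mean"] for label, s in summarize(X, y).items()}


def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == t for x, t in zip(X, y)) / len(y)


def single_shot(sites, test, seed=0):
    """Exchange summaries once, train one model per site, average the accuracies."""
    rng = random.Random(seed)
    # The only communication step: each site shares its summary, not its raw data.
    summaries = [summarize(X, y) for X, y in sites]
    accs = []
    for i, (X, y) in enumerate(sites):
        X_train, y_train = list(X), list(y)
        for j, s in enumerate(summaries):
            if j != i:  # resample from every other site's distribution
                Xs, ys = resample(s, rng)
                X_train += Xs
                y_train += ys
        model = fit_centroids(X_train, y_train)
        accs.append(accuracy(model, *test))
    return statistics.fmean(accs)


def make_site(n_per_class, rng):
    """Hypothetical toy data: two well-separated 2-D Gaussian classes."""
    X, y = [], []
    for label, center in ((0, 0.0), (1, 4.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y


data_rng = random.Random(42)
sites = [make_site(30, data_rng) for _ in range(3)]
test_set = make_site(100, data_rng)
global_acc = single_shot(sites, test_set)
print(f"global average accuracy: {global_acc:.3f}")
```

In this sketch each site transmits only a handful of means and standard deviations, and does so once, which is the sense in which a singleshot scheme keeps network load below that of a multishot scheme that exchanges updates over many rounds.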

Original language: English (US)
Article number: 108418
Journal: Journal of Neuroscience Methods
Volume: 329
DOI: 10.1016/j.jneumeth.2019.108418
State: Published - Jan 1 2020
Externally published: Yes

Fingerprint

Neuroimaging
Diagnostic Imaging
Statistical Distributions
Health Insurance Portability and Accountability Act
Privacy
Research
Schizophrenia

Keywords

  • Decentralized learning
  • Deep learning
  • Neuroimaging
  • Statistical inference

ASJC Scopus subject areas

  • Neuroscience (all)

Cite this

Decentralized distribution-sampled classification models with application to brain imaging. / Lewis, Noah; Gazula, Harshvardhan; Plis, Sergey M.; Calhoun, Vince D.

In: Journal of Neuroscience Methods, Vol. 329, 108418, 01.01.2020.

@article{fd802423144d443bb8bcc6f183f9fcce,
title = "Decentralized distribution-sampled classification models with application to brain imaging",
keywords = "Decentralized learning, Deep learning, Neuroimaging, Statistical inference",
author = "Lewis, Noah and Gazula, Harshvardhan and Plis, {Sergey M.} and Calhoun, {Vince D.}",
year = "2020",
month = "1",
day = "1",
doi = "10.1016/j.jneumeth.2019.108418",
language = "English (US)",
volume = "329",
journal = "Journal of Neuroscience Methods",
issn = "0165-0270",
publisher = "Elsevier",
}

TY - JOUR

T1 - Decentralized distribution-sampled classification models with application to brain imaging

AU - Lewis, Noah

AU - Gazula, Harshvardhan

AU - Plis, Sergey M.

AU - Calhoun, Vince D.

PY - 2020/1/1

Y1 - 2020/1/1

KW - Decentralized learning

KW - Deep learning

KW - Neuroimaging

KW - Statistical inference

UR - http://www.scopus.com/inward/record.url?scp=85073241506&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85073241506&partnerID=8YFLogxK

U2 - 10.1016/j.jneumeth.2019.108418

DO - 10.1016/j.jneumeth.2019.108418

M3 - Article

C2 - 31630085

AN - SCOPUS:85073241506

VL - 329

JO - Journal of Neuroscience Methods

JF - Journal of Neuroscience Methods

SN - 0165-0270

M1 - 108418

ER -