See without looking: Joint Visualization of sensitive multi-site datasets

Debbrata K. Saha, Vince Daniel Calhoun, Sandeep R. Panta, Sergey M. Plis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Visualization of high-dimensional, large-scale datasets via an embedding into a 2D map is a powerful exploration tool for assessing latent structure in the data and detecting outliers. Many methods have been developed for this task, but most assume that all pairs of samples are available for common computation. Specifically, the distances between all pairs of points need to be directly computable. In contrast, we work with sensitive neuroimaging data, where local sites cannot share their samples and the distances cannot be easily computed across the sites. Yet the desire is to let all the local data participate in collaborative computation without leaving their respective sites. In this scenario, a quality control tool that visualizes a decentralized dataset in its entirety via global aggregation of local computations is especially important, as it would allow screening of samples that cannot be evaluated otherwise. This paper introduces an algorithm to solve this problem: decentralized data stochastic neighbor embedding (dSNE). Based on the MNIST dataset, we introduce metrics for measuring the embedding quality and use them to compare dSNE to its centralized counterpart. We also apply dSNE to a multi-site neuroimaging dataset with encouraging results.

Original language: English (US)
Title of host publication: 26th International Joint Conference on Artificial Intelligence, IJCAI 2017
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 2672-2678
Number of pages: 7
ISBN (Electronic): 9780999241103
State: Published - 2017
Externally published: Yes
Event: 26th International Joint Conference on Artificial Intelligence, IJCAI 2017 - Melbourne, Australia
Duration: Aug 19 2017 to Aug 25 2017

Other

Other: 26th International Joint Conference on Artificial Intelligence, IJCAI 2017
Country: Australia
City: Melbourne
Period: 8/19/17 to 8/25/17

Fingerprint

Neuroimaging
Visualization
Quality control
Screening
Agglomeration

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Saha, D. K., Calhoun, V. D., Panta, S. R., & Plis, S. M. (2017). See without looking: Joint Visualization of sensitive multi-site datasets. In 26th International Joint Conference on Artificial Intelligence, IJCAI 2017 (pp. 2672-2678). International Joint Conferences on Artificial Intelligence.

See without looking: Joint Visualization of sensitive multi-site datasets. / Saha, Debbrata K.; Calhoun, Vince Daniel; Panta, Sandeep R.; Plis, Sergey M.

26th International Joint Conference on Artificial Intelligence, IJCAI 2017. International Joint Conferences on Artificial Intelligence, 2017. p. 2672-2678.

Saha, DK, Calhoun, VD, Panta, SR & Plis, SM 2017, See without looking: Joint Visualization of sensitive multi-site datasets. in 26th International Joint Conference on Artificial Intelligence, IJCAI 2017. International Joint Conferences on Artificial Intelligence, pp. 2672-2678, 26th International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, 8/19/17.
Saha DK, Calhoun VD, Panta SR, Plis SM. See without looking: Joint Visualization of sensitive multi-site datasets. In 26th International Joint Conference on Artificial Intelligence, IJCAI 2017. International Joint Conferences on Artificial Intelligence. 2017. p. 2672-2678.
Saha, Debbrata K.; Calhoun, Vince Daniel; Panta, Sandeep R.; Plis, Sergey M. / See without looking: Joint Visualization of sensitive multi-site datasets. 26th International Joint Conference on Artificial Intelligence, IJCAI 2017. International Joint Conferences on Artificial Intelligence, 2017. pp. 2672-2678.
@inproceedings{663bc728d1574dcea33bdc1f84407954,
title = "See without looking: Joint Visualization of sensitive multi-site datasets",
abstract = "Visualization of high dimensional large-scale datasets via an embedding into a 2D map is a powerful exploration tool for assessing latent structure in the data and detecting outliers. There are many methods developed for this task but most assume that all pairs of samples are available for common computation. Specifically, the distances between all pairs of points need to be directly computable. In contrast, we work with sensitive neuroimaging data, when local sites cannot share their samples and the distances cannot be easily computed across the sites. Yet, the desire is to let all the local data participate in collaborative computation without leaving their respective sites. In this scenario, a quality control tool that visualizes decentralized dataset in its entirety via global aggregation of local computations is especially important as it would allow screening of samples that cannot be evaluated otherwise. This paper introduces an algorithm to solve this problem: decentralized data stochastic neighbor embedding (dSNE). Based on the MNIST dataset we introduce metrics for measuring the embedding quality and use them to compare dSNE to its centralized counterpart. We also apply dSNE to a multi-site neuroimaging dataset with encouraging results.",
author = "Saha, {Debbrata K.} and Calhoun, {Vince Daniel} and Panta, {Sandeep R.} and Plis, {Sergey M.}",
year = "2017",
language = "English (US)",
pages = "2672--2678",
booktitle = "26th International Joint Conference on Artificial Intelligence, IJCAI 2017",
publisher = "International Joint Conferences on Artificial Intelligence"
}

TY - GEN

T1 - See without looking

T2 - Joint Visualization of sensitive multi-site datasets

AU - Saha, Debbrata K.

AU - Calhoun, Vince Daniel

AU - Panta, Sandeep R.

AU - Plis, Sergey M.

PY - 2017

Y1 - 2017

AB - Visualization of high dimensional large-scale datasets via an embedding into a 2D map is a powerful exploration tool for assessing latent structure in the data and detecting outliers. There are many methods developed for this task but most assume that all pairs of samples are available for common computation. Specifically, the distances between all pairs of points need to be directly computable. In contrast, we work with sensitive neuroimaging data, when local sites cannot share their samples and the distances cannot be easily computed across the sites. Yet, the desire is to let all the local data participate in collaborative computation without leaving their respective sites. In this scenario, a quality control tool that visualizes decentralized dataset in its entirety via global aggregation of local computations is especially important as it would allow screening of samples that cannot be evaluated otherwise. This paper introduces an algorithm to solve this problem: decentralized data stochastic neighbor embedding (dSNE). Based on the MNIST dataset we introduce metrics for measuring the embedding quality and use them to compare dSNE to its centralized counterpart. We also apply dSNE to a multi-site neuroimaging dataset with encouraging results.

UR - http://www.scopus.com/inward/record.url?scp=85031946035&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85031946035&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85031946035

SP - 2672

EP - 2678

BT - 26th International Joint Conference on Artificial Intelligence, IJCAI 2017

PB - International Joint Conferences on Artificial Intelligence

ER -