Identifying outliers using multiple kernel canonical correlation analysis with application to imaging genetics

Md Ashad Alam, Vince Daniel Calhoun, Yu Ping Wang

Research output: Contribution to journal › Article › peer-review

Abstract

Identifying significant outliers or atypical objects in multimodal datasets is an essential and challenging problem in biomedical research. This problem is addressed using the influence function of multiple kernel canonical correlation analysis. First, the influence functions (IFs) of the kernel mean element, the kernel covariance operator, the kernel cross-covariance operator, and kernel canonical correlation analysis (kernel CCA) are studied. Second, an IF of multiple kernel CCA is proposed, which can be applied to multimodal datasets. Third, a visualization method is proposed to detect influential observations in multiple sources of data based on the IFs of kernel CCA and multiple kernel CCA. Finally, to validate the method, experiments on both synthesized and imaging genetics data (e.g., SNP, fMRI, and DNA methylation) are performed. To examine the outliers, both the stem-and-leaf display and a distribution-based technique are used. The performance of the proposed approach is illustrated on 116 candidate regions of interest (ROIs) from the fMRI data of a schizophrenia study to identify significant ROIs. The proposed method and two state-of-the-art statistical methods identified 8, 34, and 10 ROIs, respectively. Based on an online database, the brain mappings of the 7 ROIs selected in common indicate the irregular brain regions susceptible to schizophrenia. The results demonstrate that the proposed method is capable of analyzing outliers and the influence of observations, and is applicable to many other biomedical datasets, which are often high-dimensional and multimodal.
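To make the first ingredient named in the abstract concrete, the following is a minimal sketch of the empirical influence of each observation on the kernel mean element. It relies only on the standard fact that the influence function of a mean is the deviation from that mean, so in the feature space its squared norm at x_i is k(x_i, x_i) - (2/n) Σ_j k(x_i, x_j) + (1/n²) Σ_{j,l} k(x_j, x_l). This is not the authors' multiple kernel CCA procedure; the Gaussian kernel, the bandwidth `sigma`, the synthetic data, and the function names are all assumptions made for illustration.

```python
import numpy as np

def gaussian_gram(X, sigma):
    """Gram matrix of a Gaussian kernel (kernel choice is an assumption)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.maximum(d2, 0.0, out=d2)  # clip tiny negatives from round-off
    return np.exp(-d2 / (2.0 * sigma**2))

def kernel_mean_influence_norms(K):
    """RKHS norm of k(., x_i) minus the empirical kernel mean, for every i.

    Hypothetical helper: evaluates the empirical influence function of the
    kernel mean element at each sample point using only the Gram matrix K.
    """
    diag = np.diag(K)           # k(x_i, x_i)
    row_mean = K.mean(axis=1)   # (1/n) sum_j k(x_i, x_j)
    total_mean = K.mean()       # (1/n^2) sum_{j,l} k(x_j, x_l)
    sq_norm = diag - 2.0 * row_mean + total_mean
    return np.sqrt(np.maximum(sq_norm, 0.0))

# Synthetic data (illustrative only): 50 Gaussian points, one planted outlier.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
X[0] = [8.0, 8.0]

K = gaussian_gram(X, sigma=1.0)
norms = kernel_mean_influence_norms(K)
most_influential = int(np.argmax(norms))
print("most influential observation:", most_influential)
```

Observations with unusually large norms are candidates for closer inspection. The influence functions of the covariance operators and of kernel CCA discussed in the article involve further terms that this sketch does not attempt to reproduce.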

Original language: English (US)
Pages (from-to): 70-85
Number of pages: 16
Journal: Computational Statistics and Data Analysis
Volume: 125
State: Published - Sep 1 2018
Externally published: Yes

Keywords

  • Imaging genetics
  • Influence function
  • Multimodal datasets
  • Multiple kernel CCA
  • Outlier detection

ASJC Scopus subject areas

  • Statistics and Probability
  • Computational Mathematics
  • Computational Theory and Mathematics
  • Applied Mathematics
