CODA: High dimensional Copula Discriminant Analysis

Fang Han, Tuo Zhao, Han Liu

Research output: Contribution to journal › Article

Abstract

We propose a high-dimensional classification method named Copula Discriminant Analysis (CODA). CODA generalizes normal-based linear discriminant analysis to the larger family of Gaussian copula models (the nonparanormal) proposed by Liu et al. (2009). To achieve estimation efficiency and robustness simultaneously, nonparametric rank-based methods, including Spearman's rho and Kendall's tau, are used to estimate the covariance matrix. In high-dimensional settings, we prove that the sparsity pattern of the discriminant features can be consistently recovered at the parametric rate and that the expected misclassification error converges to the Bayes risk. Our theory is backed up by careful numerical experiments, which show that the extra flexibility gained by CODA incurs little efficiency loss even when the data are truly Gaussian. These results suggest that CODA can serve as an alternative to normal-based high-dimensional linear discriminant analysis.
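The rank-based estimation mentioned in the abstract can be illustrated with a small sketch. It relies on a standard fact about Gaussian copula models: the latent correlation can be recovered from Kendall's tau as sin(π/2 · tau), and from Spearman's rho as 2 · sin(π/6 · rho). Because both statistics depend only on ranks, the estimate is unchanged by strictly increasing transformations of each variable. The code below is not the authors' implementation; the function names `kendall_tau`, `rank_based_correlation`, and `sample_gaussian_pair`, the sample size, and the chosen transformations are all assumptions made for illustration, and only the Kendall's tau variant is shown.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs.

    Assumes continuous data, so ties are not handled specially.
    """
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2.0 * score / (n * (n - 1))


def rank_based_correlation(x, y):
    """Estimate the latent Gaussian-copula correlation as sin(pi/2 * tau)."""
    return math.sin(math.pi / 2.0 * kendall_tau(x, y))


def sample_gaussian_pair(n, rho, seed=0):
    """Draw n pairs from a bivariate normal with correlation rho (illustrative)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return xs, ys


# Latent Gaussian data with correlation 0.6.
x, y = sample_gaussian_pair(400, 0.6)
est_gaussian = rank_based_correlation(x, y)

# Apply strictly increasing marginal transformations, giving non-Gaussian
# marginals with the same Gaussian copula.
x_t = [math.exp(v) for v in x]
y_t = [v ** 3 for v in y]
est_transformed = rank_based_correlation(x_t, y_t)

print(est_gaussian, est_transformed)
```

The two printed estimates should agree, since the ranks are preserved by the transformations, whereas a Pearson correlation computed on the transformed data would generally differ from the latent value.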

Original language: English (US)
Pages (from-to): 629-671
Number of pages: 43
Journal: Journal of Machine Learning Research
Volume: 14
Issue number: 1
State: Published - Feb 2013

Fingerprint

Copula
Discriminant analysis
High-dimensional
Spearman's rho
Misclassification Error
Bayes Risk
Kendall's tau
Copula Models
Gaussian Model
Covariance matrix
Sparsity
Discriminant
Flexibility
Numerical Experiment
Robustness
Generalise
Alternatives
Experiments

Keywords

  • Gaussian Copula
  • High dimensional statistics
  • Nonparanormal distribution
  • Rank-based statistics
  • Sparse nonlinear discriminant analysis

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

Cite this

CODA: High dimensional Copula Discriminant Analysis. / Han, Fang; Zhao, Tuo; Liu, Han.

In: Journal of Machine Learning Research, Vol. 14, No. 1, 02.2013, p. 629-671.

Han, F, Zhao, T & Liu, H 2013, 'CODA: High dimensional Copula Discriminant Analysis', Journal of Machine Learning Research, vol. 14, no. 1, pp. 629-671.
Han, Fang; Zhao, Tuo; Liu, Han. / CODA: High dimensional Copula Discriminant Analysis. In: Journal of Machine Learning Research. 2013; Vol. 14, No. 1. pp. 629-671.
@article{d72925336c36499a824a225693071e0d,
title = "CODA: High dimensional Copula Discriminant Analysis",
abstract = "We propose a high dimensional classification method, named the Copula Discriminant Analysis (CODA). The CODA generalizes the normal-based linear discriminant analysis to the larger Gaussian Copula models (or the nonparanormal) as proposed by Liu et al. (2009). To simultaneously achieve estimation efficiency and robustness, the nonparametric rank-based methods including the Spearman's rho and Kendall's tau are exploited in estimating the covariance matrix. In high dimensional settings, we prove that the sparsity pattern of the discriminant features can be consistently recovered with the parametric rate, and the expected misclassification error is consistent to the Bayes risk. Our theory is backed up by careful numerical experiments, which show that the extra flexibility gained by the CODA method incurs little efficiency loss even when the data are truly Gaussian. These results suggest that the CODA method can be an alternative choice besides the normal-based high dimensional linear discriminant analysis.",
keywords = "Gaussian Copula, High dimensional statistics, Nonparanormal distribution, Rank-based statistics, Sparse nonlinear discriminant analysis",
author = "Fang Han and Tuo Zhao and Han Liu",
year = "2013",
month = "2",
language = "English (US)",
volume = "14",
pages = "629--671",
journal = "Journal of Machine Learning Research",
issn = "1532-4435",
publisher = "Microtome Publishing",
number = "1",

}

TY - JOUR

T1 - CODA

T2 - High dimensional Copula Discriminant Analysis

AU - Han, Fang

AU - Zhao, Tuo

AU - Liu, Han

PY - 2013/2

Y1 - 2013/2

AB - We propose a high dimensional classification method, named the Copula Discriminant Analysis (CODA). The CODA generalizes the normal-based linear discriminant analysis to the larger Gaussian Copula models (or the nonparanormal) as proposed by Liu et al. (2009). To simultaneously achieve estimation efficiency and robustness, the nonparametric rank-based methods including the Spearman's rho and Kendall's tau are exploited in estimating the covariance matrix. In high dimensional settings, we prove that the sparsity pattern of the discriminant features can be consistently recovered with the parametric rate, and the expected misclassification error is consistent to the Bayes risk. Our theory is backed up by careful numerical experiments, which show that the extra flexibility gained by the CODA method incurs little efficiency loss even when the data are truly Gaussian. These results suggest that the CODA method can be an alternative choice besides the normal-based high dimensional linear discriminant analysis.

KW - Gaussian Copula

KW - High dimensional statistics

KW - Nonparanormal distribution

KW - Rank-based statistics

KW - Sparse nonlinear discriminant analysis

UR - http://www.scopus.com/inward/record.url?scp=84875190972&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84875190972&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84875190972

VL - 14

SP - 629

EP - 671

JO - Journal of Machine Learning Research

JF - Journal of Machine Learning Research

SN - 1532-4435

IS - 1

ER -