Statistical analysis of latent generalized correlation matrix estimation in transelliptical distribution

Fang Han, Han Liu

Research output: Contribution to journal › Review article

Abstract

Correlation matrices play a key role in many multivariate methods (e.g., graphical model estimation and factor analysis). The current state-of-the-art in estimating large correlation matrices focuses on the use of Pearson's sample correlation matrix. Although Pearson's sample correlation matrix enjoys various good properties under Gaussian models, it is not an effective estimator when facing heavy-tailed distributions. As a robust alternative, Han and Liu [J. Am. Stat. Assoc. 109 (2015) 275-287] advocated the use of a transformed version of the Kendall's tau sample correlation matrix in estimating high dimensional latent generalized correlation matrix under the transelliptical distribution family (or elliptical copula). The transelliptical family assumes that after unspecified marginal monotone transformations, the data follow an elliptical distribution. In this paper, we study the theoretical properties of the Kendall's tau sample correlation matrix and its transformed version proposed in Han and Liu [J. Am. Stat. Assoc. 109 (2015) 275-287] for estimating the population Kendall's tau correlation matrix and the latent Pearson's correlation matrix under both spectral and restricted spectral norms. With regard to the spectral norm, we highlight the role of "effective rank" in quantifying the rate of convergence. With regard to the restricted spectral norm, we for the first time present a "sign sub-Gaussian condition" which is sufficient to guarantee that the rank-based correlation matrix estimator attains the fast rate of convergence. In both cases, we do not need any moment condition.
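
The estimator discussed in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes that the "transformed version" of the Kendall's tau matrix is the entrywise sine transform sin(pi/2 * tau), and that "effective rank" means the trace divided by the spectral norm. All function names are hypothetical, and `scipy.stats.kendalltau` computes tau-b, which coincides with the usual Kendall's tau when the data have no ties.

```python
import numpy as np
from scipy.stats import kendalltau


def kendall_tau_matrix(X):
    """Pairwise Kendall's tau sample correlation matrix of the columns of X."""
    n, d = X.shape
    T = np.eye(d)
    for j in range(d):
        for k in range(j + 1, d):
            # kendalltau returns (statistic, p-value); only the statistic is used.
            tau, _ = kendalltau(X[:, j], X[:, k])
            T[j, k] = T[k, j] = tau
    return T


def latent_correlation(X):
    """Assumed transform: entrywise sin(pi/2 * tau) of the Kendall's tau matrix."""
    return np.sin(0.5 * np.pi * kendall_tau_matrix(X))


def effective_rank(S):
    """Assumed definition: trace(S) divided by the spectral norm of S."""
    return np.trace(S) / np.linalg.norm(S, 2)


# Illustrative data: Gaussian sample pushed through increasing marginal maps.
rng = np.random.default_rng(0)
Sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
Z = rng.multivariate_normal(np.zeros(3), Sigma, size=500)
X = np.exp(Z)  # strictly increasing transform of each coordinate

R_hat = latent_correlation(X)
print(np.round(R_hat, 2))
print(effective_rank(Sigma))
```

Because Kendall's tau depends only on the ranks within each coordinate, the estimate computed from `X` equals the one computed from the untransformed `Z`, which is why no marginal moment condition enters this sketch.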

Original language: English (US)
Pages (from-to): 23-57
Number of pages: 35
Journal: Bernoulli
Volume: 23
Issue number: 1
DOI: 10.3150/15-BEJ702
State: Published - Feb 1 2017

Keywords

  • Double asymptotics
  • Elliptical copula
  • Kendall's tau correlation matrix
  • Rate of convergence
  • Transelliptical model

ASJC Scopus subject areas

  • Statistics and Probability

Cite this

Statistical analysis of latent generalized correlation matrix estimation in transelliptical distribution. / Han, Fang; Liu, Han.

In: Bernoulli, Vol. 23, No. 1, 01.02.2017, p. 23-57.

@article{a27723bbae57444696c16736cfb7ce82,
title = "Statistical analysis of latent generalized correlation matrix estimation in transelliptical distribution",
abstract = "Correlation matrices play a key role in many multivariate methods (e.g., graphical model estimation and factor analysis). The current state-of-the-art in estimating large correlation matrices focuses on the use of Pearson's sample correlation matrix. Although Pearson's sample correlation matrix enjoys various good properties under Gaussian models, it is not an effective estimator when facing heavy-tailed distributions. As a robust alternative, Han and Liu [J. Am. Stat. Assoc. 109 (2015) 275-287] advocated the use of a transformed version of the Kendall's tau sample correlation matrix in estimating high dimensional latent generalized correlation matrix under the transelliptical distribution family (or elliptical copula). The transelliptical family assumes that after unspecified marginal monotone transformations, the data follow an elliptical distribution. In this paper, we study the theoretical properties of the Kendall's tau sample correlation matrix and its transformed version proposed in Han and Liu [J. Am. Stat. Assoc. 109 (2015) 275-287] for estimating the population Kendall's tau correlation matrix and the latent Pearson's correlation matrix under both spectral and restricted spectral norms. With regard to the spectral norm, we highlight the role of {"}effective rank{"} in quantifying the rate of convergence. With regard to the restricted spectral norm, we for the first time present a {"}sign sub-Gaussian condition{"} which is sufficient to guarantee that the rank-based correlation matrix estimator attains the fast rate of convergence. In both cases, we do not need any moment condition.",
keywords = "Double asymptotics, Elliptical copula, Kendall's tau correlation matrix, Rate of convergence, Transelliptical model",
author = "Fang Han and Han Liu",
year = "2017",
month = "2",
day = "1",
doi = "10.3150/15-BEJ702",
language = "English (US)",
volume = "23",
pages = "23--57",
journal = "Bernoulli",
issn = "1350-7265",
publisher = "International Statistical Institute",
number = "1",
}

TY - JOUR

T1 - Statistical analysis of latent generalized correlation matrix estimation in transelliptical distribution

AU - Han, Fang

AU - Liu, Han

PY - 2017/2/1

Y1 - 2017/2/1

AB - Correlation matrices play a key role in many multivariate methods (e.g., graphical model estimation and factor analysis). The current state-of-the-art in estimating large correlation matrices focuses on the use of Pearson's sample correlation matrix. Although Pearson's sample correlation matrix enjoys various good properties under Gaussian models, it is not an effective estimator when facing heavy-tailed distributions. As a robust alternative, Han and Liu [J. Am. Stat. Assoc. 109 (2015) 275-287] advocated the use of a transformed version of the Kendall's tau sample correlation matrix in estimating high dimensional latent generalized correlation matrix under the transelliptical distribution family (or elliptical copula). The transelliptical family assumes that after unspecified marginal monotone transformations, the data follow an elliptical distribution. In this paper, we study the theoretical properties of the Kendall's tau sample correlation matrix and its transformed version proposed in Han and Liu [J. Am. Stat. Assoc. 109 (2015) 275-287] for estimating the population Kendall's tau correlation matrix and the latent Pearson's correlation matrix under both spectral and restricted spectral norms. With regard to the spectral norm, we highlight the role of "effective rank" in quantifying the rate of convergence. With regard to the restricted spectral norm, we for the first time present a "sign sub-Gaussian condition" which is sufficient to guarantee that the rank-based correlation matrix estimator attains the fast rate of convergence. In both cases, we do not need any moment condition.

KW - Double asymptotics

KW - Elliptical copula

KW - Kendall's tau correlation matrix

KW - Rate of convergence

KW - Transelliptical model

UR - http://www.scopus.com/inward/record.url?scp=84991800831&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84991800831&partnerID=8YFLogxK

U2 - 10.3150/15-BEJ702

DO - 10.3150/15-BEJ702

M3 - Review article

VL - 23

SP - 23

EP - 57

JO - Bernoulli

JF - Bernoulli

SN - 1350-7265

IS - 1

ER -