A concrete statistical realization of Kleinberg's stochastic discrimination for pattern recognition. Part I. Two-class classification

Dechang Chen, Peng Huang, Xiuzhen Cheng

Research output: Contribution to journal › Article

Abstract

The method of stochastic discrimination (SD) introduced by Kleinberg is a new method in statistical pattern recognition. It works by producing many weak classifiers and then combining them to form a strong classifier. However, the strict mathematical assumptions in Kleinberg [The Annals of Statistics 24 (1996) 2319-2349] are rarely met in practice. This paper provides an applicable way to realize the SD algorithm. We recast SD in a probability-space framework and present a concrete statistical realization of SD for two-class pattern recognition. We weaken Kleinberg's theoretically strict assumptions of uniformity and indiscernibility by introducing near uniformity and weak indiscernibility. Such weaker notions are easily encountered in practical applications. We present a systematic resampling method to produce weak classifiers and then establish corresponding classification rules of SD. We analyze the performance of SD theoretically and explain why SD is overtraining-resistant and why SD has a high convergence rate. Testing results on real and simulated data sets are also given.
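The idea of building many weak classifiers and averaging them can be illustrated with a minimal sketch. It is not the resampling method of the paper, which the abstract does not spell out. It assumes instead a common textbook presentation of SD: weak classifiers are random axis-aligned boxes in the unit cube, a box is kept only if it covers a larger fraction of class 1 training points than of class 2 (the gap `min_gap` is a hypothetical tuning parameter), and each kept box is standardized as (1_S(x) - r2) / (r1 - r2), where r1 and r2 are its coverage fractions on the two training classes. All function names are illustrative.

```python
import random


def make_box(rng, dim):
    """Draw a random axis-aligned box inside the unit cube (a candidate weak classifier)."""
    box = []
    for _ in range(dim):
        a, b = sorted((rng.random(), rng.random()))
        box.append((a, b))
    return box


def covers(box, x):
    """Return True if point x lies inside the box."""
    return all(a <= xi <= b for (a, b), xi in zip(box, x))


def coverage(box, points):
    """Fraction of the given points that the box covers."""
    return sum(1 for p in points if covers(box, p)) / len(points)


def train_sd(class1, class2, n_models=500, min_gap=0.1, seed=0, max_tries=200000):
    """Collect boxes that are enriched for class 1 by at least min_gap (assumed criterion)."""
    rng = random.Random(seed)
    dim = len(class1[0])
    models = []
    tries = 0
    while len(models) < n_models and tries < max_tries:
        tries += 1
        box = make_box(rng, dim)
        r1 = coverage(box, class1)
        r2 = coverage(box, class2)
        if r1 - r2 >= min_gap:
            models.append((box, r1, r2))
    return models


def discriminant(models, x):
    """Average of the standardized weak classifiers at x."""
    total = 0.0
    for box, r1, r2 in models:
        indicator = 1.0 if covers(box, x) else 0.0
        total += (indicator - r2) / (r1 - r2)
    return total / len(models)


def classify(models, x):
    """Assign class 1 when the discriminant is at least 1/2, otherwise class 2."""
    return 1 if discriminant(models, x) >= 0.5 else 2


def demo_data(seed=1, n=100, sigma=0.08):
    """Two synthetic Gaussian clusters clipped to the unit square (illustrative data only)."""
    rng = random.Random(seed)

    def clip(v):
        return min(1.0, max(0.0, v))

    class1 = [(clip(rng.gauss(0.3, sigma)), clip(rng.gauss(0.3, sigma))) for _ in range(n)]
    class2 = [(clip(rng.gauss(0.7, sigma)), clip(rng.gauss(0.7, sigma))) for _ in range(n)]
    return class1, class2
```

With this standardization, the discriminant averages exactly 1 over the class 1 training points and exactly 0 over the class 2 training points, which is what motivates the threshold of 1/2. How tightly individual points concentrate around those values depends on how evenly the weak classifiers cover the training set, which is the role uniformity plays in the theory; this sketch makes no attempt to enforce it.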

Original language: English (US)
Pages (from-to): 1393-1412
Number of pages: 20
Journal: Annals of Statistics
Volume: 31
Issue number: 5
DOIs: 10.1214/aos/1065705112
State: Published - Oct 2003
Externally published: Yes

Fingerprint

Pattern Recognition
Discrimination
Classifier
Uniformity
Resampling Methods
Class
Classification Rules
Probability Space
Convergence Rate
Statistics
Testing

Keywords

  • Accuracy
  • Discriminant function
  • Test set
  • Training set

ASJC Scopus subject areas

  • Mathematics(all)
  • Statistics and Probability

Cite this

A concrete statistical realization of Kleinberg's stochastic discrimination for pattern recognition. Part I. Two-class classification. / Chen, Dechang; Huang, Peng; Cheng, Xiuzhen.

In: Annals of Statistics, Vol. 31, No. 5, Oct. 2003, p. 1393-1412.

@article{ceab8c18c6354a17aaef6fab752e3951,
title = "A concrete statistical realization of Kleinberg's stochastic discrimination for pattern recognition. Part I. Two-class classification",
abstract = "The method of stochastic discrimination (SD) introduced by Kleinberg is a new method in statistical pattern recognition. It works by producing many weak classifiers and then combining them to form a strong classifier. However, the strict mathematical assumptions in Kleinberg [The Annals of Statistics 24 (1996) 2319-2349] are rarely met in practice. This paper provides an applicable way to realize the SD algorithm. We recast SD in a probability-space framework and present a concrete statistical realization of SD for two-class pattern recognition. We weaken Kleinberg's theoretically strict assumptions of uniformity and indiscernibility by introducing near uniformity and weak indiscernibility. Such weaker notions are easily encountered in practical applications. We present a systematic resampling method to produce weak classifiers and then establish corresponding classification rules of SD. We analyze the performance of SD theoretically and explain why SD is overtraining-resistant and why SD has a high convergence rate. Testing results on real and simulated data sets are also given.",
keywords = "Accuracy, Discriminant function, Test set, Training set",
author = "Dechang Chen and Peng Huang and Xiuzhen Cheng",
year = "2003",
month = "10",
doi = "10.1214/aos/1065705112",
language = "English (US)",
volume = "31",
pages = "1393--1412",
journal = "Annals of Statistics",
issn = "0090-5364",
publisher = "Institute of Mathematical Statistics",
number = "5",
}

TY - JOUR

T1 - A concrete statistical realization of Kleinberg's stochastic discrimination for pattern recognition. Part I. Two-class classification

AU - Chen, Dechang

AU - Huang, Peng

AU - Cheng, Xiuzhen

PY - 2003/10

Y1 - 2003/10

N2 - The method of stochastic discrimination (SD) introduced by Kleinberg is a new method in statistical pattern recognition. It works by producing many weak classifiers and then combining them to form a strong classifier. However, the strict mathematical assumptions in Kleinberg [The Annals of Statistics 24 (1996) 2319-2349] are rarely met in practice. This paper provides an applicable way to realize the SD algorithm. We recast SD in a probability-space framework and present a concrete statistical realization of SD for two-class pattern recognition. We weaken Kleinberg's theoretically strict assumptions of uniformity and indiscernibility by introducing near uniformity and weak indiscernibility. Such weaker notions are easily encountered in practical applications. We present a systematic resampling method to produce weak classifiers and then establish corresponding classification rules of SD. We analyze the performance of SD theoretically and explain why SD is overtraining-resistant and why SD has a high convergence rate. Testing results on real and simulated data sets are also given.

KW - Accuracy

KW - Discriminant function

KW - Test set

KW - Training set

UR - http://www.scopus.com/inward/record.url?scp=0242511570&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0242511570&partnerID=8YFLogxK

U2 - 10.1214/aos/1065705112

DO - 10.1214/aos/1065705112

M3 - Article

AN - SCOPUS:0242511570

VL - 31

SP - 1393

EP - 1412

JO - Annals of Statistics

JF - Annals of Statistics

SN - 0090-5364

IS - 5

ER -