MILES: Multiclass Imbalanced Learning in Ensembles through Selective Sampling

Ali Azari, Vandana P. Janeja, Scott Levin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Imbalanced learning is the problem of learning from datasets in which the class proportions are highly skewed. Imbalanced datasets are increasingly common in many domains and pose a challenge to traditional classification techniques. Learning from imbalanced multiclass data (three or more classes) creates additional complexities. Studies suggest that ensemble learners can be trained to emphasize different segments of the data pertaining to different classes and thereby produce more accurate results than regular imbalanced learning techniques. We therefore propose a new approach to building ensembles of classifiers for multiclass imbalanced datasets, called Multiclass Imbalanced Learning in Ensembles through Selective Sampling (MILES). Each member of MILES is trained on data selectively sampled from bands around cluster centroids, in a way that aggressively encourages diversity within the ensemble. Resampling techniques are used to balance the distribution of the data drawn from each cluster. We performed several experiments applying our approach to different datasets, demonstrating improved performance in recognizing minority class examples and in balancing the G-mean and Mean Area Under the Curve (MAUC). We further applied MILES to classify prolonged emergency department (ED) stays, with consistently higher performance than existing methods. Copyright is held by the owner/author(s).
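The abstract reports G-mean as one of its evaluation measures but does not define it. As a minimal sketch, assuming the common multiclass definition (the geometric mean of per-class recalls), the following shows why such a measure is preferred over plain accuracy on imbalanced data. The function names `per_class_recall` and `g_mean` and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import prod


def per_class_recall(y_true, y_pred):
    """Return {class: recall} for every class that appears in y_true."""
    totals = Counter(y_true)
    # Count correct predictions per true class; missing classes count as 0.
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition, see lead-in)."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1.0 / len(recalls))


# Hypothetical toy data: one majority class "a" and two minority classes.
y_true = ["a"] * 8 + ["b"] + ["c"]
y_pred = ["a"] * 10  # a classifier that always predicts the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", accuracy)
print("G-mean:", g_mean(y_true, y_pred))
```

Under this definition, a classifier that ignores any one class scores a G-mean of zero regardless of its accuracy, which is why the measure suits the minority-class recognition the abstract emphasizes.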

Original language: English (US)
Title of host publication: 32nd Annual ACM Symposium on Applied Computing, SAC 2017
Publisher: Association for Computing Machinery
Pages: 811-816
Number of pages: 6
ISBN (Electronic): 9781450344869
DOIs: https://doi.org/10.1145/3019612.3019667
State: Published - Apr 3 2017
Event: 32nd Annual ACM Symposium on Applied Computing, SAC 2017 - Marrakesh, Morocco
Duration: Apr 4 2017 to Apr 6 2017

Publication series

Name: Proceedings of the ACM Symposium on Applied Computing
Volume: Part F128005

Other

Other: 32nd Annual ACM Symposium on Applied Computing, SAC 2017
Country: Morocco
City: Marrakesh
Period: 4/4/17 to 4/6/17

Keywords

  • Class imbalance
  • Ensemble learning
  • Multiclass classification

ASJC Scopus subject areas

  • Software


Cite this

Azari, A., Janeja, V. P., & Levin, S. (2017). MILES: Multiclass Imbalanced Learning in Ensembles through Selective Sampling. In 32nd Annual ACM Symposium on Applied Computing, SAC 2017 (pp. 811-816). (Proceedings of the ACM Symposium on Applied Computing; Vol. Part F128005). Association for Computing Machinery. https://doi.org/10.1145/3019612.3019667