Shapelet ensemble for multi-dimensional time series

Mustafa S. Cetin, Abdullah Mueen, Vince Daniel Calhoun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Time series shapelets are small subsequences that maximally differentiate classes of time series. Since the inception of shapelets, researchers have used them in various data domains, including anthropology and health care, and in the process have suggested many efficient techniques for shapelet discovery. However, multi-dimensional time series data poses unique challenges to shapelet discovery that are yet to be solved. We show that an ensemble of shapelet-based decision trees on individual dimensions works better than shapelets defined over multiple dimensions. Generating a shapelet ensemble for multi-dimensional time series is computationally expensive. Most of the existing techniques prune shapelet candidates for speed. In this paper, we propose a novel technique for shapelet discovery that evaluates the remaining candidates efficiently. Our algorithm uses a multi-length approximate index for time series data to efficiently find the nearest neighbors of the candidate shapelets. We employ a simple skipping technique for additional candidate pruning and a voting-based technique to improve accuracy while retaining interpretability. Not only do we obtain a significant speedup, but our techniques also enable us to efficiently discover shapelets on datasets with long, multi-dimensional time series, such as hours of brain activity recordings. We demonstrate our approach on a biomedical dataset and find significant differences between patients with schizophrenia and healthy controls.
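
To make the per-dimension voting idea concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes one already-discovered shapelet per dimension, reduces each shapelet-based decision tree to a single threshold test (a stump), and omits the z-normalization, approximate indexing, and candidate skipping that the abstract mentions. All names (subsequence_distance, stump_predict, ensemble_predict) and all values in the toy data are hypothetical.

```python
import math
from collections import Counter


def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    equal-length sliding window of the series (no normalization)."""
    m = len(shapelet)
    best = math.inf
    for start in range(len(series) - m + 1):
        d = math.sqrt(sum((series[start + i] - shapelet[i]) ** 2
                          for i in range(m)))
        best = min(best, d)
    return best


def stump_predict(series, shapelet, threshold, near_label, far_label):
    """Assumed simplification: a one-node tree that returns near_label
    when the series contains a close match to the shapelet."""
    close = subsequence_distance(series, shapelet) <= threshold
    return near_label if close else far_label


def ensemble_predict(sample, stumps):
    """Majority vote over one stump per dimension.
    sample[d] is the series for dimension d; stumps[d] is its
    (shapelet, threshold, near_label, far_label) tuple."""
    votes = [stump_predict(sample[d], *stump)
             for d, stump in enumerate(stumps)]
    return Counter(votes).most_common(1)[0][0]


# Toy data (hypothetical): three dimensions, one stump each.
stumps = [
    ([0, 1, 0], 0.5, "A", "B"),
    ([1, 1, 1], 0.5, "A", "B"),
    ([0, 2, 0], 0.5, "A", "B"),
]
sample = [
    [0, 0, 1, 0, 0],   # contains [0, 1, 0] exactly, so this votes "A"
    [1, 1, 1, 1, 1],   # contains [1, 1, 1] exactly, so this votes "A"
    [5, 5, 5, 5, 5],   # far from [0, 2, 0], so this votes "B"
]
prediction = ensemble_predict(sample, stumps)
```

Because each vote comes from a single dimension and a single shapelet, the prediction can be traced back to the specific subsequence that triggered it, which is one way to read the abstract's point about retaining interpretability. An odd number of voters is used here to avoid ties in the two-class toy case.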

Original language: English (US)
Title of host publication: SIAM International Conference on Data Mining 2015, SDM 2015
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 307-315
Number of pages: 9
ISBN (Print): 9781510811522
Publication status: Published - 2015
Externally published: Yes
Event: SIAM International Conference on Data Mining 2015, SDM 2015 - Vancouver, Canada
Duration: Apr 30 2015 to May 2 2015

Other

Other: SIAM International Conference on Data Mining 2015, SDM 2015
Country: Canada
City: Vancouver
Period: 4/30/15 to 5/2/15

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Cetin, M. S., Mueen, A., & Calhoun, V. D. (2015). Shapelet ensemble for multi-dimensional time series. In SIAM International Conference on Data Mining 2015, SDM 2015 (pp. 307-315). Society for Industrial and Applied Mathematics Publications.