Shapelet ensemble for multi-dimensional time series

Mustafa S. Cetin; Abdullah Mueen; Vince D. Calhoun

doi:10.1137/1.9781611974010.35

School of Medicine

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

21 Scopus citations

Abstract

Time series shapelets are small subsequences that maximally differentiate classes of time series. Since the inception of shapelets, researchers have applied them in various data domains, including anthropology and health care, and in the process have suggested many efficient techniques for shapelet discovery. However, multi-dimensional time series data poses unique challenges to shapelet discovery that are yet to be solved. We show that an ensemble of shapelet-based decision trees on individual dimensions works better than shapelets defined over multiple dimensions. Generating a shapelet ensemble for multi-dimensional time series is computationally expensive. Most of the existing techniques prune shapelet candidates for speed. In this paper, we propose a novel technique for shapelet discovery that evaluates the remaining candidates efficiently. Our algorithm uses a multi-length approximate index for time series data to efficiently find the nearest neighbors of the candidate shapelets. We employ a simple skipping technique for additional candidate pruning and a voting-based technique to improve accuracy while retaining interpretability. Beyond a significant speed increase, our techniques enable us to efficiently discover shapelets on datasets with long, multi-dimensional time series, such as hours of brain activity recordings. We demonstrate our approach on a biomedical dataset and find significant differences between patients with schizophrenia and healthy controls.
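The two ideas the abstract combines, a shapelet distance and a vote across per-dimension classifiers, can be illustrated with a minimal sketch. This is not the authors' algorithm: it uses a brute-force sliding-window search rather than the multi-length approximate index, it replaces each shapelet-based decision tree with a single-split "stump", and it assumes z-normalized Euclidean distance, which is common in the shapelet literature but is not stated in the abstract. All function names, the threshold, and the toy data are hypothetical.

```python
import math
from collections import Counter


def znorm(seq):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / len(seq))
    if std == 0:
        return [0.0] * len(seq)
    return [(x - mean) / std for x in seq]


def shapelet_distance(series, shapelet):
    """Smallest z-normalized Euclidean distance between the shapelet
    and any window of the same length in the series (brute force)."""
    m = len(shapelet)
    target = znorm(shapelet)
    best = math.inf
    for start in range(len(series) - m + 1):
        window = znorm(series[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, target)))
        best = min(best, dist)
    return best


def stump_predict(series, shapelet, threshold, near_label, far_label):
    """Single-split classifier (assumed stand-in for a decision tree):
    predict near_label when the shapelet occurs closely in the series."""
    if shapelet_distance(series, shapelet) <= threshold:
        return near_label
    return far_label


def ensemble_predict(sample, stumps):
    """Majority vote over one stump per dimension.

    sample: list of 1-D series, one per dimension.
    stumps: list of (shapelet, threshold, near_label, far_label) tuples.
    """
    votes = [stump_predict(dim, *stump) for dim, stump in zip(sample, stumps)]
    return Counter(votes).most_common(1)[0][0]


# Toy three-dimensional sample: two dimensions contain a bump, one is a ramp.
bump = [0.0, 1.0, 0.0]
stumps = [(bump, 0.5, "A", "B")] * 3
sample = [
    [0, 0, 0, 2, 0, 0],
    [0, 0, 5, 0, 0, 0],
    [0, 1, 2, 3, 4, 5],
]
label = ensemble_predict(sample, stumps)
```

Because each vote comes from a single dimension and a single shapelet, the dimensions that drove the decision can be read off directly, which is one way to see how voting can keep the model interpretable.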

Original language	English (US)
Title of host publication	SIAM International Conference on Data Mining 2015, SDM 2015
Editors	Suresh Venkatasubramanian, Jieping Ye
Publisher	Society for Industrial and Applied Mathematics Publications
Pages	307-315
Number of pages	9
ISBN (Electronic)	9781510811522
DOIs	https://doi.org/10.1137/1.9781611974010.35
State	Published - 2015
Event	SIAM International Conference on Data Mining 2015, SDM 2015 - Vancouver, Canada Duration: Apr 30 2015 → May 2 2015

Publication series

Name	SIAM International Conference on Data Mining 2015, SDM 2015

Other

Other	SIAM International Conference on Data Mining 2015, SDM 2015
Country/Territory	Canada
City	Vancouver
Period	4/30/15 → 5/2/15

ASJC Scopus subject areas

Computational Theory and Mathematics
Computer Vision and Pattern Recognition
Software

Access to Document

10.1137/1.9781611974010.35

Cite this

Cetin, M. S., Mueen, A., & Calhoun, V. D. (2015). Shapelet ensemble for multi-dimensional time series. In S. Venkatasubramanian, & J. Ye (Eds.), SIAM International Conference on Data Mining 2015, SDM 2015 (pp. 307-315). (SIAM International Conference on Data Mining 2015, SDM 2015). Society for Industrial and Applied Mathematics Publications. https://doi.org/10.1137/1.9781611974010.35

Shapelet ensemble for multi-dimensional time series. / Cetin, Mustafa S.; Mueen, Abdullah; Calhoun, Vince D.
SIAM International Conference on Data Mining 2015, SDM 2015. ed. / Suresh Venkatasubramanian; Jieping Ye. Society for Industrial and Applied Mathematics Publications, 2015. p. 307-315 (SIAM International Conference on Data Mining 2015, SDM 2015).

Cetin, MS, Mueen, A & Calhoun, VD 2015, Shapelet ensemble for multi-dimensional time series. in S Venkatasubramanian & J Ye (eds), SIAM International Conference on Data Mining 2015, SDM 2015. SIAM International Conference on Data Mining 2015, SDM 2015, Society for Industrial and Applied Mathematics Publications, pp. 307-315, SIAM International Conference on Data Mining 2015, SDM 2015, Vancouver, Canada, 4/30/15. https://doi.org/10.1137/1.9781611974010.35

@inproceedings{7eafbc0594b54dc2b1b3472db9150028,

title = "Shapelet ensemble for multi-dimensional time series",

author = "Cetin, {Mustafa S.} and Abdullah Mueen and Calhoun, {Vince D.}",

note = "Publisher Copyright: Copyright {\textcopyright} SIAM.; SIAM International Conference on Data Mining 2015, SDM 2015 ; Conference date: 30-04-2015 Through 02-05-2015",

year = "2015",

doi = "10.1137/1.9781611974010.35",

language = "English (US)",

series = "SIAM International Conference on Data Mining 2015, SDM 2015",

publisher = "Society for Industrial and Applied Mathematics Publications",

pages = "307--315",

editor = "Suresh Venkatasubramanian and Jieping Ye",

booktitle = "SIAM International Conference on Data Mining 2015, SDM 2015",

}

TY - GEN

T1 - Shapelet ensemble for multi-dimensional time series

AU - Cetin, Mustafa S.

AU - Mueen, Abdullah

AU - Calhoun, Vince D.

PY - 2015

Y1 - 2015

UR - http://www.scopus.com/inward/record.url?scp=84946794479&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84946794479&partnerID=8YFLogxK

U2 - 10.1137/1.9781611974010.35

DO - 10.1137/1.9781611974010.35

M3 - Conference contribution

AN - SCOPUS:84946794479

T3 - SIAM International Conference on Data Mining 2015, SDM 2015

SP - 307

EP - 315

BT - SIAM International Conference on Data Mining 2015, SDM 2015

A2 - Venkatasubramanian, Suresh

A2 - Ye, Jieping

PB - Society for Industrial and Applied Mathematics Publications

T2 - SIAM International Conference on Data Mining 2015, SDM 2015

Y2 - 30 April 2015 through 2 May 2015

ER -
