The Scientific World Journal
Volume 2013 (2013), Article ID 386180, 11 pages
http://dx.doi.org/10.1155/2013/386180
Research Article

PRESEE: An MDL/MML Algorithm to Time-Series Stream Segmenting

1College of Computer Science & Technology, Chengdu University of Information Technology, Chengdu 610225, China
2School of Computing and Information Sciences, Florida International University, Miami, FL 33199, USA
3Department of Computer Science, Purdue University, West Lafayette, IN 47996, USA
4Guangxi Teachers Education University, Nanning 530001, China
5School of Computer Science, Sichuan University, Chengdu 610065, China

Received 31 March 2013; Accepted 9 May 2013

Academic Editors: R. Haber, S.-S. Liaw, J. Ma, and R. Valencia-Garcia

Copyright © 2013 Kaikuo Xu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Time-series streams are among the most common data types in data mining. They are prevalent in fields such as the stock market, ecology, and medical care. Segmentation is a key step in accelerating time-series stream mining. Previous segmenting algorithms focused mainly on improving precision and paid little attention to efficiency. Moreover, their performance depends heavily on parameters that are hard for users to set. In this paper, we propose PRESEE (parameter-free, real-time, and scalable time-series stream segmenting algorithm), which greatly improves the efficiency of time-series stream segmenting. PRESEE is based on both the MDL (minimum description length) and MML (minimum message length) methods, which allow it to segment the data automatically. To evaluate the performance of PRESEE, we conduct several experiments on time-series streams of different types and compare it with the state-of-the-art algorithm. The empirical results show that PRESEE is very efficient on real-time stream datasets, improving segmenting speed nearly tenfold. The novelty of this algorithm is further demonstrated by applying PRESEE to segment real-time stream datasets from the ChinaFLUX sensor network data stream.