Using Closed n-set Patterns for Spatio-Temporal Classification

Today, huge volumes of sensor data are collected from many different sources. One of the most crucial data mining tasks for this data is prediction and classification, which make it possible to anticipate trends or failures and take adequate steps. While the initial data might be of limited interest in itself, additional information, e.g., latent attributes and spatio-temporal details, can add significant value and interestingness. In this paper we present a classification approach, called Closed n-set Spatio-Temporal Classification (CnSC), which is based on latent attributes, pattern mining, and classification model construction. As the number of generated patterns is huge, we employ a scalable NoSQL-based graph database for efficient storage and retrieval. By considering hierarchies in the latent attributes, we define pattern and context similarity scores. The classification model for a specific context is constructed by aggregating the most similar patterns. CnSC is evaluated on a real dataset and shows competitive results compared with other prediction strategies.
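To make the idea of hierarchy-based similarity and aggregation concrete, the following is a minimal Python sketch, not the CnSC method itself. The abstract does not give the actual score definitions, so everything here is an assumption for illustration: the toy hierarchy `PARENT`, the Wu-Palmer-style `value_similarity`, the averaging in `context_similarity`, and the similarity-weighted vote over the `k` most similar patterns in `classify`.

```python
from collections import defaultdict

# Hypothetical attribute hierarchies: child -> parent (None marks a root).
PARENT = {
    "any_time": None, "weekday": "any_time", "weekend": "any_time",
    "monday": "weekday", "tuesday": "weekday", "saturday": "weekend",
    "any_place": None, "city": "any_place", "rural": "any_place",
    "downtown": "city", "suburb": "city",
}


def ancestors(value):
    """Return the path from value up to its root, value first."""
    path = []
    while value is not None:
        path.append(value)
        value = PARENT[value]
    return path


def value_similarity(a, b):
    """Assumed Wu-Palmer-style score: 2*depth(lca) / (depth(a) + depth(b))."""
    path_a, path_b = ancestors(a), ancestors(b)
    in_b = set(path_b)
    common = [v for v in path_a if v in in_b]
    if not common:  # values come from different hierarchies
        return 0.0
    lca_depth = len(ancestors(common[0]))  # root has depth 1
    return 2 * lca_depth / (len(path_a) + len(path_b))


def context_similarity(ctx_a, ctx_b):
    """Assumed context score: mean per-attribute similarity."""
    sims = [value_similarity(a, b) for a, b in zip(ctx_a, ctx_b)]
    return sum(sims) / len(sims)


def classify(target, patterns, k=3):
    """Similarity-weighted vote over the k patterns closest to target."""
    scored = sorted(
        ((context_similarity(target, ctx), label) for ctx, label in patterns),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = defaultdict(float)
    for sim, label in scored[:k]:
        votes[label] += sim
    return max(votes, key=votes.get)


# Hypothetical patterns: (context, class label).
patterns = [
    (("monday", "downtown"), "high"),
    (("tuesday", "suburb"), "high"),
    (("saturday", "rural"), "low"),
    (("saturday", "downtown"), "low"),
]
print(classify(("tuesday", "downtown"), patterns))
```

In this sketch, two values that share a deep common ancestor (such as `monday` and `tuesday` under `weekday`) score higher than values that only meet at the root, so patterns mined for nearby contexts dominate the vote for a new context.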
