Temporal Feature Selection on Networked Time Series

This paper formulates the problem of learning discriminative features (i.e., segments) from networked time series data while taking into account the links among the time series. For example, social network users can be viewed as social sensors that continuously generate social signals (tweets), which are represented as time series. The discriminative segments of a time series are often referred to as shapelets. Extracting shapelets for time series classification has been widely studied. However, existing work on shapelet selection assumes that the time series are independent and identically distributed (i.i.d.). This assumption limits its applicability to the analysis of socially networked time series, since a user's actions can be correlated with those of his/her social affiliations. In this paper we propose a new Network Regularized Least Squares (NetRLS) feature selection model that combines conventional time series data with user network data. Experiments on real-world networked time series from Twitter and DBLP show that NetRLS performs better than LTS, the state-of-the-art time series feature selection approach.
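
To make the two ingredients named above concrete, the following is a minimal sketch and not the NetRLS model itself, whose objective is not given here. It assumes that a shapelet feature is the minimum mean squared distance between a candidate segment and any equal-length window of a series, and that the network enters through a graph-Laplacian smoothness penalty on the predictions of linked users. The function names, the parameters `alpha` and `beta`, and the symmetric adjacency matrix `A` are illustrative assumptions.

```python
import numpy as np


def shapelet_distance(series, shapelet):
    """Minimum mean squared distance between a shapelet and any
    equal-length window of the series (assumed feature definition)."""
    series = np.asarray(series, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    # All contiguous windows of the series with the shapelet's length.
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    return float(np.min(np.mean((windows - shapelet) ** 2, axis=1)))


def laplacian_regularized_ls(X, y, A, alpha=1.0, beta=1.0):
    """Closed-form minimizer of
        ||X w - y||^2 + alpha * (X w)^T L (X w) + beta * ||w||^2,
    where L = D - A is the Laplacian of a symmetric adjacency matrix A.
    This is a generic network-regularized least squares, used here as a
    stand-in assumption for the kind of model the abstract describes."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    d = X.shape[1]
    # Setting the gradient to zero gives a linear system in w.
    lhs = X.T @ X + alpha * (X.T @ L @ X) + beta * np.eye(d)
    rhs = X.T @ y
    return np.linalg.solve(lhs, rhs)
```

In this sketch, each row of `X` would hold one user's shapelet distances, and a larger `alpha` pulls the predictions of linked users closer together, which is one way to relax the i.i.d. assumption.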
