A Latent Source Model for Nonparametric Time Series Classification

For classifying time series, a nearest-neighbor approach is widely used in practice, with performance often competitive with or better than more elaborate methods such as neural networks, decision trees, and support vector machines. We develop theoretical justification for the effectiveness of nearest-neighbor-like classification of time series. Our guiding hypothesis is that in many applications, such as forecasting which topics will become trends on Twitter, there are few prototypical time series relative to the number of time series we have access to; for example, topics become trends on Twitter in only a few distinct ways, whereas we can collect massive amounts of Twitter data. To operationalize this hypothesis, we propose a latent source model for time series, which naturally leads to a "weighted majority voting" classification rule that can be approximated by a nearest-neighbor classifier. We establish nonasymptotic performance guarantees for both weighted majority voting and nearest-neighbor classification under our model, accounting for how much of the time series we observe and for the model complexity. Experimental results on synthetic data show weighted majority voting achieving the same misclassification rate as nearest-neighbor classification while observing less of the time series. We then use weighted majority voting to forecast which news topics on Twitter become trends, where we are able to detect such "trending topics" in advance of Twitter 79% of the time, with a mean early advantage of 1 hour and 26 minutes, a true positive rate of 95%, and a false positive rate of 4%.
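To make the relationship between the two classification rules concrete, the following is a minimal sketch, not the implementation used in the experiments. It assumes binary labels in {+1, -1}, squared Euclidean distance on the observed prefix, and exponentially decaying vote weights exp(-gamma * distance); none of these choices is specified in the abstract, and the function names and the `gamma` parameter are hypothetical. Under these assumptions, a large `gamma` concentrates the vote on the closest training series, so weighted majority voting reduces to nearest-neighbor classification.

```python
import math


def sq_dist(r, s):
    """Squared Euclidean distance between s and the first len(s) samples of r."""
    return sum((ri - si) ** 2 for ri, si in zip(r, s))


def weighted_majority_vote(train, labels, s, gamma=1.0):
    """Classify the observed prefix s by a weighted vote of all training series.

    Assumption: each training series votes for its own label (+1 or -1)
    with weight exp(-gamma * squared distance to s).
    """
    dists = [sq_dist(r, s) for r in train]
    d_min = min(dists)
    # Subtracting d_min rescales every weight by the same positive factor,
    # which avoids underflow and leaves the sign of the score unchanged.
    score = sum(y * math.exp(-gamma * (d - d_min)) for y, d in zip(labels, dists))
    return 1 if score >= 0 else -1


def nearest_neighbor(train, labels, s):
    """Classify s with the label of the single closest training series."""
    dists = [sq_dist(r, s) for r in train]
    return labels[dists.index(min(dists))]


# Toy data: one +1 series exactly at the observation, three -1 series nearby.
train = [[0.0, 0.0], [0.3, 0.0], [0.3, 0.0], [0.3, 0.0]]
labels = [1, -1, -1, -1]
observed = [0.0]  # only the first sample has been observed so far

vote_small_gamma = weighted_majority_vote(train, labels, observed, gamma=1.0)
vote_large_gamma = weighted_majority_vote(train, labels, observed, gamma=100.0)
nn_label = nearest_neighbor(train, labels, observed)
```

In this toy example the two rules disagree at `gamma=1.0`, where the three nearby -1 series together outvote the single closest +1 series, and agree at `gamma=100.0`, where the closest series dominates the vote.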
