Sequence Prediction With Sparse Distributed Hyperdimensional Coding Applied to the Analysis of Mobile Phone Use Patterns

Modeling and predicting temporal sequences is central to many signal processing and machine learning applications. Prediction based on sequence history is typically performed with parametric models such as fixed-order Markov chains (n-grams), with approximations of high-order Markov processes such as mixed-order Markov models or mixtures of lagged bigram models, or with other machine learning techniques. This paper presents a method for sequence prediction based on sparse hyperdimensional coding of the sequence structure and describes how higher-order temporal structures can be utilized in sparse coding in a balanced manner. The method is purely incremental, allowing real-time online learning and prediction with limited computational resources. Experiments on mobile phone use patterns, including prediction of the next launched application, the next GPS location of the user, and the next artist played with the phone's media player, show that the proposed method captures the relevant variable-order structure of the sequences. Compared with n-grams and mixed-order Markov models, the sparse hyperdimensional predictor clearly outperforms both in unweighted average recall and matches the weighted average recall of the mixed-order Markov model, without requiring the batch training that the mixed-order model needs.
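The two evaluation measures named above can diverge sharply when class frequencies are skewed, which is why both are reported. The following is a minimal sketch of how they are commonly computed, not the evaluation code used in the paper: unweighted average recall (UAR) is the plain mean of the per-class recalls, while weighted average recall (WAR) weights each class recall by its frequency and therefore equals overall accuracy. The function name `recalls` and the toy labels are hypothetical, chosen only for illustration.

```python
from collections import Counter


def recalls(y_true, y_pred):
    """Return (UAR, WAR) for two equal-length label sequences."""
    # Number of occurrences of each true class.
    total = Counter(y_true)
    # Number of correct predictions for each true class.
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Recall of a class = correct predictions / occurrences of that class.
    per_class = {c: hits[c] / n for c, n in total.items()}
    # UAR: every class counts equally, regardless of how often it occurs.
    uar = sum(per_class.values()) / len(per_class)
    # WAR: classes weighted by frequency, which reduces to overall accuracy.
    war = sum(hits.values()) / len(y_true)
    return uar, war


# Hypothetical toy data: a predictor that always outputs the most frequent label.
y_true = ["mail"] * 8 + ["maps"] * 2
y_pred = ["mail"] * 10
uar, war = recalls(y_true, y_pred)
print(f"UAR = {uar:.2f}, WAR = {war:.2f}")  # UAR = 0.50, WAR = 0.80
```

In this toy case the majority-class predictor reaches a WAR of 0.80 but a UAR of only 0.50, so a method that improves UAR at equal WAR is recovering the rarer classes more reliably.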
