A Classification Approach for Prediction of Target Events in Temporal Sequences

Learning to predict significant events from sequences of data with categorical features is an important problem in many application areas. We focus on events arising in system management and formulate prediction as a classification problem. We perform co-occurrence analysis of events by applying Singular Value Decomposition (SVD) to the examples constructed from the data, and combine this with Support Vector Machine (SVM) classification to obtain efficient and accurate predictions. We analyze the statistical properties of event data to explain why SVM classification is suitable for such data, and we perform an empirical study using real data.
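The abstract does not say how examples are constructed from the event sequence, so the following is only a minimal sketch of one plausible construction, not the authors' procedure. It assumes a sliding window: the features are counts of each event type in the preceding `window` events, and the label records whether the target event occurs among the next `horizon` events. The function name `build_examples`, both parameters, and the event names are hypothetical.

```python
from collections import Counter


def build_examples(events, target, window=3, horizon=2):
    """Turn a categorical event sequence into (features, label) pairs.

    Assumed construction (not taken from the paper):
      features = counts of each event type in the `window` events before position i
      label    = 1 if `target` occurs among the next `horizon` events, else 0
    """
    vocab = sorted(set(events))  # fixed column order for the feature vector
    examples = []
    for i in range(window, len(events) - horizon + 1):
        counts = Counter(events[i - window:i])
        features = [counts.get(e, 0) for e in vocab]
        label = int(target in events[i:i + horizon])
        examples.append((features, label))
    return vocab, examples


# Hypothetical system-management event log, for illustration only.
events = ["link_down", "retry", "timeout", "crash", "restart", "retry", "ok"]
vocab, examples = build_examples(events, target="crash")

for features, label in examples:
    print(dict(zip(vocab, features)), "->", label)
```

Stacking the feature vectors row by row gives an examples-by-event-types count matrix. Under this reading, that matrix is what the SVD step would factor, and its reduced representation is what the SVM classifier would be trained on.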
