Recognizing and Discovering Complex Events in Sequences

Finding complex patterns in long temporal or spatial sequences from real-world applications is attracting increasing interest in data mining. However, standard data mining techniques, taken in isolation, appear inadequate for this task. Symbolic approaches have difficulty dealing with noise, while non-symbolic approaches, such as neural networks and statistical methods, have difficulty dealing with very long subsequences in which relevant episodes may be interleaved with large gaps. The solution we suggest is to integrate the logic-based approach with non-symbolic methods in a unified paradigm, as has already been done in other Artificial Intelligence tasks. We propose a framework in which a high-level knowledge representation is used to incorporate domain-specific knowledge and to focus attention on relevant episodes during the mining process, while flexible matching algorithms developed in the pattern recognition area are used to deal with noisy data. The knowledge extraction process follows a machine learning paradigm that combines inductive and deductive learning: deduction steps can be interleaved with induction steps aimed at augmenting a weak domain theory with knowledge extracted from the data. The framework is formally characterized and then experimentally tested on an artificial dataset, showing its ability to deal with noise and with long gaps between relevant episodes.
