GSOM sequence: An unsupervised dynamic approach for knowledge discovery in temporal data

A significant problem in knowledge discovery is dealing with data that have temporal dependencies. The attributes associated with temporal data need to be processed differently from non-temporal attributes. A typical approach to this issue is to view temporal data as an ordered sequence of events. In this work, we propose a novel dynamic unsupervised learning approach to discover patterns in temporal data. The new technique is based on the Growing Self-Organizing Map (GSOM), a structure-adapting version of the Self-Organizing Map (SOM). The SOM is widely used in knowledge discovery applications due to its unsupervised learning nature, ease of use and visualization capabilities. The GSOM further enhances the SOM with faster processing, more representative cluster formation and the ability to control map spread. This paper describes a significant extension to the GSOM that enables it to be used for analyzing data with temporal sequences. The similarity between two time-dependent sequences of unequal length is estimated using the Dynamic Time Warping (DTW) algorithm, which is incorporated into the GSOM. Experiments were carried out to evaluate the performance and the validity of the proposed approach using an audio-visual data set. The results demonstrate that the novel “GSOM Sequence” algorithm improves the accuracy and validity of the clusters obtained.
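To illustrate how DTW can compare sequences of unequal length, the following is a minimal sketch of the textbook dynamic-programming formulation, not the implementation used in the paper. It assumes one-dimensional numeric sequences and an absolute-difference local cost. The names `dtw_distance` and `best_matching_node` are hypothetical, and the second function only suggests how such a distance might replace the Euclidean distance when searching a map for the closest node.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two 1-D numeric sequences (may differ in length)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: absolute difference between the two samples.
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def best_matching_node(sequence, node_sequences):
    """Hypothetical helper: index of the stored sequence closest to `sequence` under DTW."""
    distances = [dtw_distance(sequence, w) for w in node_sequences]
    return min(range(len(distances)), key=distances.__getitem__)
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated sample can be absorbed by warping, whereas a point-by-point Euclidean comparison is not even defined for the two lengths. Multivariate features, such as audio-visual frames, would need a vector local cost in place of the absolute difference.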
