Self-Organizing Hidden Markov Model Map (SOHMMM)

A hybrid approach combining the Self-Organizing Map (SOM) and the Hidden Markov Model (HMM) is presented. The Self-Organizing Hidden Markov Model Map (SOHMMM) bridges the theoretical foundations and algorithmic realizations of its two constituents. Their architectures and learning methodologies are fused to meet the growing demands posed by the analysis of deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and protein chain molecules. Combining SOM unsupervised training with the HMM dynamic programming algorithms yields a novel on-line gradient descent unsupervised learning algorithm, which is fully integrated into the SOHMMM. Because the SOHMMM performs probabilistic sequence analysis with little or no prior knowledge, it is applicable to clustering, dimensionality reduction, and visualization of large-scale sequence spaces, as well as to sequence discrimination, search, and classification. Two series of experiments, based on artificial sequence data and on splice junction gene sequences, demonstrate the SOHMMM's characteristics and capabilities.
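To make the combination of the two constituents concrete, the following is a minimal sketch of how a map of HMMs might score a sequence and select a winning node. It is not the SOHMMM learning algorithm itself, which the abstract does not specify. It assumes that each map node holds a discrete HMM, that the winner is the node with the highest sequence likelihood (computed with the standard scaled forward recursion), and that neighbors are weighted by a Gaussian function of grid distance, as in a conventional SOM. The names `forward_log_likelihood`, `best_matching_node`, `neighborhood`, and the two toy HMMs are hypothetical and chosen for illustration only.

```python
import math

def forward_log_likelihood(seq, pi, A, B):
    """Scaled forward recursion (HMM dynamic programming): log P(seq | pi, A, B)."""
    n = len(pi)
    alpha = [pi[i] * B[i][seq[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][seq[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under this model
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_lik += math.log(scale)
    return log_lik

def best_matching_node(seq, nodes):
    """Assumed winner rule: index of the node HMM with the highest likelihood."""
    scores = [forward_log_likelihood(seq, *hmm) for hmm in nodes]
    return max(range(len(nodes)), key=scores.__getitem__)

def neighborhood(winner, positions, sigma):
    """Gaussian SOM neighborhood weights around the winner on the map grid."""
    wx, wy = positions[winner]
    return [math.exp(-((x - wx) ** 2 + (y - wy) ** 2) / (2.0 * sigma ** 2))
            for x, y in positions]

# Toy two-node map over the DNA alphabet (illustrative parameters, not from the paper).
SYMBOLS = {"A": 0, "C": 1, "G": 2, "T": 3}
STICKY = [[0.9, 0.1], [0.1, 0.9]]
AT_RICH = ([0.5, 0.5], STICKY, [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]])
GC_RICH = ([0.5, 0.5], STICKY, [[0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1]])
NODES = [AT_RICH, GC_RICH]
POSITIONS = [(0, 0), (0, 1)]

def encode(dna):
    """Map a DNA string to integer symbol indices."""
    return [SYMBOLS[ch] for ch in dna]

if __name__ == "__main__":
    winner = best_matching_node(encode("AATTTA"), NODES)
    print("winner:", winner)
    print("neighborhood weights:", neighborhood(winner, POSITIONS, sigma=1.0))
```

In an on-line scheme of this kind, the neighborhood weights would scale a per-node parameter update after each presented sequence; the form of that update is left out here because it depends on details the abstract does not give.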
