Spatio-Temporal Mask Learning: Application to Speech Recognition

In this paper, we describe the “spatio-temporal” map, an original algorithm for learning and recognizing dynamic patterns represented as sequences. This work is oriented toward an internal and explicit representation of time, which seems to be neurobiologically relevant. The map is composed of units with three kinds of links: feed-forward connections, intra-map connections, and inter-map connections. This architecture is able to learn sequences from an input stream in a manner that is robust to noise. The learning process is self-organized for the feed-forward links and “pseudo” self-organized for the intra-map links. An application to the recognition of spoken French digits is presented.