Fully unsupervised word learning from continuous speech using transitional probabilities of atomic acoustic events

This work presents a learning algorithm based on transitional probabilities of atomic acoustic events (vector-quantized spectral features). The algorithm learns models of word-like units in speech without any supervision and without a priori knowledge of phonemic or linguistic units. The learned models can be used to segment novel utterances into word-like units, which supports the theory that transitional probabilities of acoustic events could serve as a bootstrapping mechanism for language learning. The performance of the algorithm is evaluated on a corpus of Finnish infant-directed speech.
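To make the notion of a transitional probability concrete, the following minimal sketch estimates P(next event | current event) from a sequence of discrete labels and places a boundary wherever that probability falls below a threshold. It is an illustration only and not the algorithm of this work: the integer labels stand in for vector-quantized spectral frames, and the function names, the toy sequence, and the threshold of 0.75 are all assumptions made for the example.

```python
from collections import Counter


def transitional_probabilities(labels):
    """Estimate P(b | a) for each adjacent pair (a, b) in a label sequence."""
    pair_counts = Counter(zip(labels, labels[1:]))
    # Count each label only where it has a successor, so that the
    # probabilities out of every label sum to one.
    first_counts = Counter(labels[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


def segment(labels, tp, threshold):
    """Split a label sequence wherever the transition probability is low.

    Transitions never seen in training are treated as probability 0.
    """
    if not labels:
        return []
    segments = [[labels[0]]]
    for a, b in zip(labels, labels[1:]):
        if tp.get((a, b), 0.0) < threshold:
            segments.append([b])    # unlikely transition: start a new unit
        else:
            segments[-1].append(b)  # likely transition: extend current unit
    return segments


# Hypothetical training sequence in which 1-2-3 and 4-5 recur as units.
training = [1, 2, 3, 1, 2, 3, 4, 5, 4, 5, 1, 2, 3]
tp = transitional_probabilities(training)

# Segment a novel sequence with an assumed threshold of 0.75.
units = segment([1, 2, 3, 4, 5], tp, threshold=0.75)
```

In this toy case the transitions inside the recurring patterns have probability 1.0, while the transition from 3 to 4 has probability 0.5, so the novel sequence is split into `[1, 2, 3]` and `[4, 5]`. A fixed threshold is the simplest possible decision rule and is used here only to keep the example short.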