Combining confusion networks with probabilistic phone matching for open-vocabulary keyword spotting in spontaneous speech signal

In this paper, we study several methods for keyword spotting in spontaneous speech. We propose a novel method for open-vocabulary keyword spotting that combines probabilistic phone matching (PSM) with word confusion networks (WCN). The method performs keyword spotting on multi-level transcriptions: the WCN and the phone one-best transcription. Classical string matching is used for word spotting on the WCN, while probabilistic string matching is used for acoustic word spotting on the phone one-best transcription. We verify that the hybrid method outperforms both the WCN-based and the PSM-based approaches for in-vocabulary and out-of-vocabulary (OOV) keywords.
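
The two matching levels described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the WCN is represented as a list of slots mapping words to posteriors, the function names (`spot_in_wcn`, `spot_in_phones`, `hybrid_spot`) are hypothetical, and a unit-cost edit distance over fixed-length windows stands in for the probabilistic string matching, whose actual scoring is not specified here. Returning the two hit lists side by side is likewise only one possible way to combine the levels.

```python
from typing import Dict, List, Tuple

# Assumed WCN representation: one {word: posterior} dict per slot.
WCN = List[Dict[str, float]]


def spot_in_wcn(wcn: WCN, keyword: List[str]) -> List[Tuple[int, float]]:
    """Exact string matching of a keyword over consecutive WCN slots.

    Returns (start_slot, score) pairs; score is the product of the
    posteriors of the matched words.
    """
    hits = []
    for start in range(len(wcn) - len(keyword) + 1):
        score = 1.0
        for offset, word in enumerate(keyword):
            posterior = wcn[start + offset].get(word)
            if posterior is None:
                break  # keyword word absent from this slot
            score *= posterior
        else:
            hits.append((start, score))
    return hits


def edit_distance(a: List[str], b: List[str]) -> int:
    """Unit-cost Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def spot_in_phones(phones: List[str], keyword_phones: List[str],
                   max_cost: int = 1) -> List[Tuple[int, int]]:
    """Approximate matching on the phone one-best transcription.

    Simplified stand-in for probabilistic string matching: slides a window
    of the keyword's length and keeps windows within max_cost edits.
    """
    n = len(keyword_phones)
    hits = []
    for start in range(len(phones) - n + 1):
        cost = edit_distance(phones[start:start + n], keyword_phones)
        if cost <= max_cost:
            hits.append((start, cost))
    return hits


def hybrid_spot(wcn: WCN, phones: List[str], keyword: List[str],
                keyword_phones: List[str], max_cost: int = 1) -> dict:
    """Run both levels and return their hits side by side (assumed combination)."""
    return {
        "wcn": spot_in_wcn(wcn, keyword),
        "phone": spot_in_phones(phones, keyword_phones, max_cost),
    }
```

In this sketch an OOV keyword never appears in a WCN slot, so only the phone-level search can return hits for it; this is the motivation for searching both levels. The phone-level search also admits near matches, so it is more prone to false alarms than the exact word-level search.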