Neural Information Processing: 27th International Conference, ICONIP 2020, Bangkok, Thailand, November 18–22, 2020, Proceedings, Part IV

In query-by-example keyword spotting, how word images are represented is a central issue. Out-of-vocabulary (OOV) queries also occur frequently, which makes OOV keyword spotting a challenging task. This paper presents a hybrid representation of word images for OOV keyword spotting. Specifically, a sequence-to-sequence model is used to generate representation vectors of word images, and a CNN model with the VGG16 architecture is used to obtain a second type of representation vector. A score fusion scheme then combines the two kinds of representation vectors. Experimental results demonstrate that the proposed hybrid representation is especially suited to OOV keyword spotting.
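
The abstract does not specify how the score fusion is computed. As a rough illustration only, the sketch below assumes that each representation yields a cosine similarity between the query word image and every candidate word image, and that the two scores are combined by a weighted sum. The function names, the use of cosine similarity, and the weight `alpha` are all assumptions made for this example, not details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fused_ranking(query_a, query_b, cands_a, cands_b, alpha=0.5):
    """Rank candidate word images by a weighted sum of two similarity scores.

    query_a, cands_a: vectors from the first representation (e.g. sequence-to-sequence).
    query_b, cands_b: vectors from the second representation (e.g. VGG16 features).
    alpha: hypothetical fusion weight in [0, 1]; not a value from the paper.
    Returns a list of (candidate_index, fused_score), best match first.
    """
    scores = []
    for idx, (cand_a, cand_b) in enumerate(zip(cands_a, cands_b)):
        score_a = cosine(query_a, cand_a)
        score_b = cosine(query_b, cand_b)
        scores.append((idx, alpha * score_a + (1.0 - alpha) * score_b))
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

With `alpha=1.0` the ranking relies on the first representation alone, and with `alpha=0.0` on the second alone, so intermediate values trade off the two views of each word image.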