Parallelization of Automatic Speech Recognition
The automatic recognition of spoken words is increasingly common, in dictaphone applications, telephone services, and the command of various devices by disabled persons. In the latter case, a high recognition rate is expected on a vocabulary of small to medium size. To achieve this goal, the model must be refined; both the training stage and the recognition stage for such applications can therefore be very time-consuming, and occasional re-training may be required. Parallelizing these stages is thus worth considering. In this paper we first present the models we use: the classical hidden Markov model and a model that takes the prosody of speech into account, namely the centisecond two-level hidden Markov model introduced by Meziane [10]. Two parallelization strategies are then detailed: the first simply shares the vocabulary among the processors, while the second also distributes the model. Experimental results highlight the need for finer load balancing: an a priori load estimation is presented and used to statically balance the computational load between the processors. Further experiments exhibit efficiencies higher than 65% on an architecture composed of 12 Pentium Pro processors interconnected via Myrinet. Directions for further improving the parallelization are given.
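To make the static balancing idea concrete, the following C sketch illustrates one plausible way to partition a vocabulary across processors using per-word cost estimates. It is a minimal illustration, not the authors' code: the cost values (here, a stand-in such as the number of HMM states per word model) and the longest-processing-time (LPT) heuristic are assumptions for the example.

```c
#include <stdio.h>
#include <stdlib.h>

#define NPROC  4   /* toy processor count (the paper uses up to 12) */
#define NWORDS 8   /* toy vocabulary size */

/* One vocabulary word and its a-priori estimated decoding cost. */
typedef struct { int word; double cost; } Item;

/* qsort comparator: sort by decreasing estimated cost. */
static int by_cost_desc(const void *a, const void *b) {
    double d = ((const Item *)b)->cost - ((const Item *)a)->cost;
    return (d > 0) - (d < 0);
}

int main(void) {
    /* Hypothetical per-word cost estimates, e.g. proportional to the
     * number of states in each word's HMM (a proxy for Viterbi cost). */
    double est[NWORDS] = { 34, 12, 27, 9, 45, 18, 22, 30 };
    Item items[NWORDS];
    double load[NPROC] = { 0 };
    int owner[NWORDS];

    for (int w = 0; w < NWORDS; w++) {
        items[w].word = w;
        items[w].cost = est[w];
    }

    /* LPT heuristic: take words in decreasing cost order and assign
     * each one to the currently least-loaded processor. */
    qsort(items, NWORDS, sizeof(Item), by_cost_desc);
    for (int i = 0; i < NWORDS; i++) {
        int p = 0;
        for (int q = 1; q < NPROC; q++)
            if (load[q] < load[p]) p = q;
        load[p] += items[i].cost;
        owner[items[i].word] = p;
    }

    for (int w = 0; w < NWORDS; w++)
        printf("word %d -> processor %d\n", w, owner[w]);
    return 0;
}
```

Under this kind of static scheme, each processor then runs recognition only over its assigned sub-vocabulary, so the quality of the a priori cost estimate directly determines how evenly the decoding work is spread.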