Effectiveness evaluation of word characteristics obtained from 3D image information for lipreading

Speech recognition using image information is attracting attention as a next-generation man-machine interface (MMI). Several methods have been proposed for recognizing words, context, and speech using either voice information alone or voice information combined with image information. Unlike voice information, image information is not affected by acoustic noise, so methods that use it are applicable in a variety of environments. In general, however, capturing a usable image imposes several constraints, for example on the camera position and on the positional relationship between the camera and the face. We investigated the effectiveness of using three-dimensional image information for word recognition and found that it removes these constraints. To confirm the effectiveness of the proposed method, we compared word characteristics obtained from two-dimensional and three-dimensional images. The results of a word recognition experiment show that the recognition rate for three-dimensional characteristics is higher than that for two-dimensional characteristics.