Hand pose estimation for American Sign Language recognition

In the foreseeable future, gesture-based input will be widely used in human-computer interfaces. This paper describes our initial attempt at recognizing 2-D hand poses for application in video-based human-computer interfaces. Specifically, this research focuses on 2-D image recognition using an evolved wavelet-based feature vector. We have developed a two-layer feed-forward neural network that recognizes the 24 static letters of the American Sign Language (ASL) alphabet from still input images. Thus far, two wavelet-based decomposition methods have been used: the first produces an 8-element real-valued feature vector and the second an 18-element feature vector. Each set of feature vectors is used to train a feed-forward neural network with Levenberg-Marquardt training. The system recognizes instances of static ASL fingerspelling with 99.9% accuracy at signal-to-noise ratios (SNRs) as low as 2. We conclude by describing issues to be resolved before expanding the corpus of ASL signs to be recognized.
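To make the pipeline concrete, the sketch below illustrates one plausible reading of the feature-extraction and classification stages in Python using PyWavelets and NumPy. The Haar wavelet, the two-level decomposition, the per-subband energy statistics, and the tanh/softmax layers are all assumptions made for illustration; the paper's exact wavelet family, feature definitions, and Levenberg-Marquardt weight-update procedure are not reproduced here.

```python
# A minimal sketch, assuming a two-level 2-D Haar decomposition with
# per-subband energies as features; the paper does not specify these details.
import numpy as np
import pywt


def wavelet_features(image: np.ndarray, levels: int = 2) -> np.ndarray:
    """Return one energy value per subband of a 2-D wavelet decomposition.

    A `levels`-level decomposition yields one approximation subband plus
    3*levels detail subbands (7 subbands for levels=2). Appending the total
    image energy gives an 8-element vector -- an assumption about how the
    paper's 8-element feature vector is formed.
    """
    coeffs = pywt.wavedec2(image.astype(float), wavelet="haar", level=levels)
    features = [np.sum(coeffs[0] ** 2)]            # approximation energy
    for detail_level in coeffs[1:]:                # (cH, cV, cD) per level
        features.extend(np.sum(band ** 2) for band in detail_level)
    features.append(np.sum(image.astype(float) ** 2))  # total image energy
    return np.asarray(features)


def mlp_forward(x, W1, b1, W2, b2):
    """Two-layer feed-forward network: tanh hidden layer, softmax output
    over the 24 static ASL letters (J and Z are dynamic and excluded)."""
    h = np.tanh(W1 @ x + b1)
    logits = W2 @ h + b2
    e = np.exp(logits - logits.max())              # numerically stable softmax
    return e / e.sum()


# Example: classify a 64x64 grayscale image (random weights for illustration;
# in the paper the weights would come from Levenberg-Marquardt training).
rng = np.random.default_rng(0)
img = rng.random((64, 64))
x = wavelet_features(img)                          # 8-element feature vector
W1, b1 = rng.standard_normal((16, 8)), np.zeros(16)
W2, b2 = rng.standard_normal((24, 16)), np.zeros(24)
probs = mlp_forward(x, W1, b1, W2, b2)             # distribution over 24 letters
```

Levenberg-Marquardt training itself is not sketched; any least-squares optimizer over the network weights (e.g., scipy.optimize.least_squares with method="lm") could stand in for it in an experiment of this kind.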