Local Filter Selection Boosts Performance of Automatic Speechreading

We examine general-purpose unsupervised techniques for visual preprocessing in machine vision tasks. In particular, we analyze a wide variety of principal component and independent component techniques in combination with stepwise regression methods for variable selection. The task is recognition of the first four digits spoken in English, with hidden Markov models (HMMs) as the recognition system. Local representations consistently outperformed global representations in generalizing to new speakers, whereas global representations performed better than local ones on speaker identification tasks. In addition, the use of a novel regression-based variable selection technique substantially boosted performance.
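
As a rough illustration of the ingredients named above, the following sketch contrasts global principal component filters (learned from whole images) with local ones (learned from small image patches), and then picks a subset of filter outputs by ordinary forward stepwise regression. It is a minimal sketch under stated assumptions, not the procedure used in this work: the novel selection technique is not specified here, so plain forward selection stands in for it, the data are synthetic random arrays, and the names `pca_filters`, `extract_patches`, and `forward_select` are hypothetical.

```python
import numpy as np

def pca_filters(X, k):
    """Return the top-k principal directions (as rows) of X (n_samples x n_features)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k]

def extract_patches(images, size, stride):
    """Collect flattened size x size patches from a stack of images (n, h, w)."""
    n, h, w = images.shape
    patches = [
        images[:, r:r + size, c:c + size].reshape(n, -1)
        for r in range(0, h - size + 1, stride)
        for c in range(0, w - size + 1, stride)
    ]
    return np.concatenate(patches, axis=0)

def forward_select(F, y, n_keep):
    """Greedy forward stepwise selection (a stand-in, not the paper's technique).

    At each step, add the column of F whose inclusion gives the smallest
    residual sum of squares for a least-squares fit (with intercept) to y.
    """
    selected, remaining = [], list(range(F.shape[1]))
    for _ in range(n_keep):
        best_j, best_rss = None, np.inf
        for j in remaining:
            A = np.column_stack([np.ones(len(y)), F[:, selected + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ coef) ** 2))
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Synthetic stand-in for image frames (assumption: 40 frames of 12 x 12 pixels).
rng = np.random.default_rng(0)
images = rng.normal(size=(40, 12, 12))

# Global filters span the whole image; local filters span a 4 x 4 patch.
global_filters = pca_filters(images.reshape(40, -1), k=5)
local_filters = pca_filters(extract_patches(images, size=4, stride=4), k=5)
```

In this sketch a global filter has as many coefficients as the image has pixels, while a local filter covers only a small patch and would be applied at many image positions; `forward_select` can then be run on the resulting filter outputs to keep only the most predictive ones.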