Frequency band‐importance functions for auditory and auditory‐visual sentence recognition

In many everyday listening environments, speech communication involves the integration of both acoustic and visual speech cues. This is especially true in noisy and reverberant environments where the speech signal is highly degraded, or when the listener has a hearing impairment. Understanding the mechanisms involved in auditory‐visual integration is a primary interest of this work. Of particular interest is whether listeners allocate their attention to frequency regions of the speech signal differently under auditory‐visual conditions than under auditory‐alone conditions. For auditory speech recognition, the most important frequency regions tend to lie around 1500–3000 Hz, corresponding roughly to important acoustic cues for place of articulation. The purpose of this study is to determine the most important frequency region under auditory‐visual speech conditions. Frequency band‐importance functions for auditory and auditory‐visual conditions were obtained by having subjects identify speech tokens under conditions in which the speech‐to‐noise ratio of different parts of the speech spectrum was independently and randomly varied on every trial. Point‐biserial correlations were computed for each spectral region, and the normalized correlations were interpreted as weights indicating the importance of each region. Relations between the band‐importance functions for auditory and auditory‐visual conditions will be discussed.
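
A minimal sketch of the correlational weight analysis described above, assuming Python with NumPy/SciPy. The trial counts, band count, SNR range, and "true" weights are illustrative placeholders, not data or parameters from the study; only the analysis steps (per-band point‐biserial correlation between band SNR and binary correctness, followed by normalization) follow the abstract.

```python
import numpy as np
from scipy.stats import pointbiserialr

rng = np.random.default_rng(0)

# Hypothetical experiment: on each trial, the SNR of each spectral band
# is drawn independently, and the response is scored correct (1) or not (0).
n_trials, n_bands = 500, 4
band_snrs = rng.uniform(-12.0, 12.0, size=(n_trials, n_bands))  # dB

# Assumed ground truth for the simulation: band 2 matters most.
true_weights = np.array([0.15, 0.20, 0.45, 0.20])
p_correct = 1.0 / (1.0 + np.exp(-0.2 * (band_snrs @ true_weights)))
correct = rng.random(n_trials) < p_correct  # binary trial scores

# Point-biserial correlation between each band's SNR and correctness;
# non-negative correlations, normalized to sum to 1, serve as the
# band-importance weights.
corrs = np.array([pointbiserialr(correct, band_snrs[:, b])[0]
                  for b in range(n_bands)])
weights = np.clip(corrs, 0.0, None)
weights /= weights.sum()
print(weights)  # should roughly recover the relative band importances
```

With enough trials, the normalized correlations approximate the relative weights used to generate the simulated responses, which is the sense in which each weight indexes the importance of its frequency region.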