Amplitude Onsets and Spectral Energy in Perceptual Experience

In a recent review, Usha Goswami outlined a central role for disordered processing of the rise times of acoustic signals in developmental dyslexia. This approach was placed within the context of amplitude modulation spectra in sounds and in neural oscillations, with an emphasis on problems with rise time processing at low frequency amplitude modulations (1.5–4 Hz), which are proposed to form an “edge” to syllables in speech (Goswami, 2011). In the perceptual experience of sound sequences, however, not all signal rise times are equal in their effects, and the onset phenomena that Goswami discusses need to be considered in the context of the spectral energy of sounds. For example, Goswami correctly stresses that people with dyslexia experience problems with onset/rhyme processing in syllables (e.g., in producing spoonerisms), but the rise time of low amplitude modulations in a syllable is not a useful marker of the onset/rhyme distinction, unless the syllable happens to start with a vowel. In the auditory periphery, amplitude variation is represented within different channels of spectral energy (e.g., Irino and Patterson, 2001), and the rise times of the energy within these channels can vary considerably. For example, Figure 1 shows the amplitude variation within seven spectral energy bandwidths in the spoken word “one.” Across the different spectral energy channels, the rise times of the amplitude envelopes clearly vary both in their duration (how long the rise takes) and in when, relative to the onset of the word, the increases in amplitude that constitute them occur.

Figure 1. Oscillograms of the word “one” spoken by a female speaker, and the output of a gammatone filterbank when this signal is passed through (number of channels = 7, ERB = 4.0). The rise times within the different ...

The ways that different kinds of spectral energy vary in a sound have specific consequences for the rhythmic phenomena Goswami addresses.
A perceptually regular sequence of speech (e.g., someone counting from 1 to 10) has no corresponding physical regularity of the onsets of the sounds: instead, perceptual “moments of occurrence,” or perceptual centers, have been identified as the aspects of sounds (the beats) which are equally timed in both the perception and production of rhythmic speech (Morton et al., 1976). Perceptual centers are associated with increases in mid-range spectral energy (around 500–1500 Hz; Marcus, 1981), i.e., with the onsets of the first formants in speech. Perceptual centers are thus linked to the onsets of vowel sounds within syllables (Cummins and Port, 1998; Scott, 1998). For example, the perceptual center of “throw” is much later than that of “row,” since the onset of the more sonorous aspects of the spectral energy is much later in “throw.” These differences influence both how talkers time the utterances of “throw” and “row” when speaking rhythmically, and how listeners set those same speech items to a rhythm. Perceptual centers thus allow us to map between the ways that rhythms in sequences are both heard and produced (Morton et al., 1976). Goswami links rise times to the “edges” of auditory objects. However, it is the perceptual center of a syllable, not its edge, which is linked with its onset and rhyme (Marcus, 1981), and it is the beat of a sound, not its edge, which drives its rhythmic properties (Terhardt and Schutte, 1976; Gordon, 1987). We agree with Goswami that it is essential to try to expand on what the “phonological” problems seen in dyslexia truly entail in acoustic terms, and we suggest that the perceptual centers of sounds can capture the properties of auditory objects that rise times alone cannot.
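The point that rise times differ across spectral channels, both in duration and in when they occur, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and none of it is the analysis behind Figure 1: the signal is synthetic (two gated tones, not speech), a brick-wall FFT band-pass stands in for a gammatone filterbank, the envelope is a rectified and smoothed band signal, and rise time is taken as the first 10%-to-90% rise of the envelope. The names `band_envelope`, `rise_time`, and `gate` are hypothetical helpers.

```python
import numpy as np

def band_envelope(signal, fs, lo_hz, hi_hz, smooth_ms=10.0):
    """Amplitude envelope of `signal` restricted to [lo_hz, hi_hz].

    Assumption: a brick-wall FFT band-pass (NOT a gammatone filter),
    followed by rectification and a moving-average smoother.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[(freqs < lo_hz) | (freqs > hi_hz)] = 0.0
    band = np.fft.irfft(spectrum, n=len(signal))
    win = max(1, int(fs * smooth_ms / 1000.0))
    kernel = np.ones(win) / win
    return np.convolve(np.abs(band), kernel, mode="same")

def rise_time(envelope, fs, lo_frac=0.1, hi_frac=0.9):
    """Return (start_s, duration_s) of the first rise of the envelope
    from lo_frac to hi_frac of its peak (an assumed criterion)."""
    peak = envelope.max()
    start = int(np.argmax(envelope >= lo_frac * peak))
    end = int(np.argmax(envelope >= hi_frac * peak))
    return start / fs, (end - start) / fs

def gate(t, onset_s, rise_s, offset_s=0.55, fall_s=0.05):
    """Linear onset ramp and linear offset ramp, both clipped to [0, 1]."""
    up = np.clip((t - onset_s) / rise_s, 0.0, 1.0)
    down = np.clip((offset_s - t) / fall_s, 0.0, 1.0)
    return up * down

# Synthetic stand-in for a sound whose spectral channels rise differently:
# a low component with an early, fast rise and a high one with a later, slow rise.
fs = 16000
t = np.arange(int(round(0.6 * fs))) / fs
low = gate(t, onset_s=0.05, rise_s=0.02) * np.sin(2 * np.pi * 300 * t)
high = gate(t, onset_s=0.15, rise_s=0.10) * np.sin(2 * np.pi * 2000 * t)
signal = low + high

low_start, low_rise = rise_time(band_envelope(signal, fs, 100, 600), fs)
high_start, high_rise = rise_time(band_envelope(signal, fs, 1500, 2500), fs)

print(f"low band:  rise starts at {low_start:.3f} s, lasts {low_rise:.3f} s")
print(f"high band: rise starts at {high_start:.3f} s, lasts {high_rise:.3f} s")
```

In this toy case the two bands of a single sound give different answers to both "how long is the rise?" and "when does it happen?", so a single rise time for the whole signal would not describe either channel.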

[1] Robert F. Port, et al. Rhythmic constraints on stress timing in English. 1998.

[2] J. W. Gordon. The perceptual attack time of musical tones. 1987, The Journal of the Acoustical Society of America.

[3] T. Irino, et al. A compressive gammachirp auditory filter for both physiological and psychophysical data. 2001, The Journal of the Acoustical Society of America.

[4] S. M. Marcus. Acoustic determinants of perceptual center (P-center) location. 1981, Perception & Psychophysics.

[5] U. Goswami. A temporal sampling framework for developmental dyslexia. 2011, Trends in Cognitive Sciences.