Auditory Temporal Asymmetry and Autocorrelation

Vowels and musical notes produce complex repeating structures in the neural activity pattern (NAP) flowing from the cochlea. A number of groups have demonstrated that the pitch of these sounds can be extracted by autocorrelating the activity in the individual channels and constructing a multi-channel autocorrelogram (ACG) (e.g. Meddis and Hewitt, 1991; Slaney and Lyon, 1990). Recently, Cariani and Delgutte (1996) showed that the physiological equivalent of the ACG, a multi-channel, all-order interval histogram, is also an excellent predictor of pitch. Several authors have gone further and argued that the ACGs of speech and musical sounds could also explain vowel quality and musical timbre (e.g. Meddis and Hewitt, 1992). There is a problem, however: the structures produced by natural sounds in the NAP are highly asymmetric in time. Autocorrelation is symmetric in time, so it converts asymmetric NAP structures into symmetric structures in the ACG and discards the information carried by the asymmetry. Patterson (1994b) and Akeroyd and Patterson (1995) have shown that we are highly sensitive to temporal asymmetry, and they have argued that, for timbre analysis at least, autocorrelation (AC) should be replaced with a form of 'strobed' temporal integration (STI), which produces a similar representation but preserves temporal asymmetry. Section 2 of this paper compares the asymmetry processing of AC with that of STI. Section 3 introduces a new form of STI that is more like what we might expect to find in the auditory system. It is based on the delta-gamma operator of Irino and Patterson (1996).
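The loss of temporal asymmetry under autocorrelation can be illustrated with a small numerical sketch. The signal below is a hypothetical test stimulus chosen only for illustration (an exponentially decaying sinusoid and its time-reversed copy); it is not a stimulus or a model taken from the studies cited above, and the names `autocorrelation`, `decaying` and `reversed_copy` are assumptions of this example. The point is only that two waveforms that differ solely in their direction in time yield the same autocorrelation function.

```python
import numpy as np


def autocorrelation(x):
    """Raw (unnormalised) autocorrelation of a real 1-D signal.

    Returns values for lags -(N-1) .. (N-1), with zero lag in the centre.
    """
    return np.correlate(x, x, mode="full")


# Hypothetical asymmetric test signal: a sinusoid with a decaying envelope.
# The sample count, decay constant and period are arbitrary choices.
n = np.arange(200)
decaying = np.exp(-n / 25.0) * np.sin(2 * np.pi * n / 10.0)

# The same waveform played backwards, so its envelope rises instead of decays.
reversed_copy = decaying[::-1]

acf_decaying = autocorrelation(decaying)
acf_reversed = autocorrelation(reversed_copy)

# The two waveforms are clearly different in the time domain ...
waveforms_differ = not np.allclose(decaying, reversed_copy)

# ... but each autocorrelation is symmetric about zero lag,
acf_is_symmetric = np.allclose(acf_decaying, acf_decaying[::-1])

# and the two autocorrelations are identical, so the direction of
# time in the original waveform cannot be recovered from them.
acfs_match = np.allclose(acf_decaying, acf_reversed)

print(waveforms_differ, acf_is_symmetric, acfs_match)
```

This is a single-channel, waveform-level sketch; a multi-channel ACG would apply the same operation to the activity in each channel of the NAP, and the same symmetry would hold channel by channel.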