Error activity and error entropy as a measure of psychoacoustic significance in the perceptual domain

Several models described in the literature seek to represent audio stimuli in the perceptual domain in order to best predict the audibility of errors and distortions. By modelling the principal nonlinear processes of human hearing, it is possible to calculate a perceptual-domain error surface that represents the audible difference between distorted and original audio signals. A further stage of analysis is required to maximise the usefulness of the auditory model output: the audible error surface must be interpreted to produce an estimate of the overall subjective judgement that would result from the particular distortion. Ideally, the interpretation of the error surface should be broadly analogous to human perceptual mechanisms, and it would equally be desirable to avoid the complex and cumbersome statistical mapping and clustering techniques proposed by some authors. A technique employed in adaptive transform coding of images, namely cell entropy, offers several desirable properties. This paper reports the extension and application of this technique to the interpretation of perceptual-domain error surfaces produced by an auditory model. Speech data were subjected to an example, algorithmically generated, nonlinear distortion and then processed by the auditory model. The usefulness of the error-activity and error-entropy quantities is illustrated, without optimisation, by comparison of model predictions with experimentally determined opinion scores.
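
As a purely illustrative sketch of the two quantities named above, the following assumes that error activity is the sum of absolute error over a time-by-pitch error surface, and that error entropy is the Shannon entropy of that surface once normalised by its activity. These definitions, the function names `error_activity` and `error_entropy`, and the toy surfaces are assumptions made for illustration; the abstract itself does not state the formulas.

```python
import math


def error_activity(surface):
    """Assumed definition: total absolute error over the surface (a list of rows)."""
    return sum(abs(e) for row in surface for e in row)


def error_entropy(surface):
    """Assumed definition: Shannon entropy (in nats) of the normalised absolute error."""
    total = error_activity(surface)
    if total == 0:
        return 0.0  # no audible error, so nothing to distribute
    entropy = 0.0
    for row in surface:
        for e in row:
            p = abs(e) / total  # share of the total error in this cell
            if p > 0:
                entropy -= p * math.log(p)
    return entropy


# Two hypothetical error surfaces with the same total error:
# one concentrated in a single cell, one spread evenly over four cells.
concentrated = [[0.0, 0.0], [0.0, 4.0]]
spread = [[1.0, 1.0], [1.0, 1.0]]

for name, surface in (("concentrated", concentrated), ("spread", spread)):
    print(name, error_activity(surface), error_entropy(surface))
```

Under these assumptions, the two surfaces have identical activity but different entropy, which suggests why the pair of quantities may carry more information about the distribution of audible error than the total alone.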