On Some Biases Encountered in Modern Audio Quality Listening Tests: A Review

A systematic review of the biases typically encountered in modern audio quality listening tests is presented. Three types of bias are discussed in detail: bias due to affective judgments, response mapping bias, and interface bias. A potential bias due to perceptually nonlinear graphic scales is also discussed. Recommendations for reducing these biases are provided, including an in-depth discussion of direct and indirect anchoring techniques.
