Personalization in Object-based Audio for Accessibility: A Review of Advancements for Hearing Impaired Listeners

Hearing loss is widespread and significantly impacts an individual’s ability to engage with broadcast media. Access can be improved through new object-based audio personalization methods. Drawing on the literature on hearing loss and intelligibility, this paper develops three dimensions for which there is evidence of improved intelligibility: spatial separation, speech-to-noise ratio, and redundancy. These can be personalized, individually or concurrently, using object-based audio. A systematic review of all work in object-based audio personalization is then undertaken. These dimensions are used to evaluate each project’s approach to personalization, identifying successful approaches, commercial challenges, and the next steps required to ensure continuing improvements to broadcast audio for hard-of-hearing individuals.
