Automatic Speech Recognition for the Hearing Impaired in an Augmented Reality Application

[1]  Janne Pylkkönen Towards Efficient and Robust Automatic Speech Recognition: Decoding Techniques and Discriminative Training , 2013 .

[2]  Yifan Gong,et al.  An Overview of Noise-Robust Automatic Speech Recognition , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[3]  Mikko Kurimo,et al.  Morfessor 2.0: Python Implementation and Extensions for Morfessor Baseline , 2013 .

[4]  Yu Zhang,et al.  Very deep convolutional networks for end-to-end speech recognition , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[5]  M. Daneman,et al.  How young and old adults listen to and remember speech in noise. , 1995, The Journal of the Acoustical Society of America.

[6]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[7]  Henry Been-Lirn Duh,et al.  Trends in augmented reality tracking, interaction and display: A review of ten years of ISMAR , 2008, 2008 7th IEEE/ACM International Symposium on Mixed and Augmented Reality.

[8]  D. Bavelier,et al.  Working Memory, Deafness, and Sign Language , 2010 .

[9]  Ulrich Neumann,et al.  Dynamic registration correction in video-based augmented reality systems , 1995, IEEE Computer Graphics and Applications.

[10]  Mikko Kurimo,et al.  Improved Subword Modeling for WFST-Based Speech Recognition , 2017, INTERSPEECH.

[11]  Dieter Schmalstieg,et al.  Adaptive information density for augmented reality displays , 2016, 2016 IEEE Virtual Reality (VR).

[12]  Zhijian Ou,et al.  A study of large vocabulary speech recognition decoding using finite-state graphs , 2010, 2010 7th International Symposium on Chinese Spoken Language Processing.

[13]  S. Kramer Hearing impairment, work, and vocational enablement , 2008, International journal of audiology.

[14]  Vivian Genaro Motti,et al.  Users' Privacy Concerns About Wearables - Impact of Form Factor, Sensors and Type of Data Collected , 2015, Financial Cryptography Workshops.

[15]  Sergei Kochkin,et al.  MarkeTrak VII: Obstacles to adult non‐user adoption of hearing aids , 2007 .

[16]  Borko Furht,et al.  Augmented Reality: An Overview , 2011, Handbook of Augmented Reality.

[17]  John D. Kelleher,et al.  Just Say It: An Evaluation of Speech Interfaces for Augmented Reality Design Applications , 2009, AICS.

[18]  H. G. Lang Higher education for deaf students: research priorities in the new millennium. , 2002, Journal of deaf studies and deaf education.

[19]  Tadayoshi Kohno,et al.  Augmented reality: hard problems of law and policy , 2014, UbiComp Adjunct.

[20]  Krzysztof Marasek,et al.  SPEECON – Speech Databases for Consumer Devices: Database Specification and Validation , 2002, LREC.

[21]  Richard E. Ladner,et al.  ClassInFocus: enabling improved visual attention strategies for deaf and hard of hearing students , 2009, Assets '09.

[22]  Sami Keronen Approaching human performance in noise robust automatic speech recognition , 2014 .

[23]  Jon Peddie,et al.  Augmented Reality: Where We Will All Live , 2017 .

[24]  Mikko Kurimo,et al.  Modeling under-resourced languages for speech recognition , 2017, Lang. Resour. Evaluation.

[25]  Gregory Kramida,et al.  Resolving the Vergence-Accommodation Conflict in Head-Mounted Displays , 2016, IEEE Transactions on Visualization and Computer Graphics.

[26]  Ivan E. Sutherland,et al.  A head-mounted three dimensional display , 1968, AFIPS Fall Joint Computing Conference.

[27]  Alfred Mertins,et al.  Automatic speech recognition and speech variability: A review , 2007, Speech Commun.

[28]  Sean White,et al.  SiteLens: situated visualization techniques for urban site visits , 2009, CHI.

[29]  Mike Wald Captioning for Deaf and Hard of Hearing People by Editing Automatic Speech Recognition in Real Time , 2006, ICCHP.

[30]  Seyed Ghorshi,et al.  Audio-visual speech recognition techniques in augmented reality environments , 2013, The Visual Computer.

[31]  Daniel Povey,et al.  The Kaldi Speech Recognition Toolkit , 2011 .

[32]  Luigi Ferrucci,et al.  Hearing loss prevalence and risk factors among older adults in the United States. , 2011, The journals of gerontology. Series A, Biological sciences and medical sciences.

[33]  Dieter Schmalstieg,et al.  Efficient and robust radiance transfer for probeless photorealistic augmented reality , 2014, 2014 IEEE Virtual Reality (VR).

[34]  Thomas Way,et al.  Inclusion of deaf students in computer science classes using real-time speech transcription , 2007, ITiCSE '07.

[35]  Geoffrey Zweig,et al.  Achieving Human Parity in Conversational Speech Recognition , 2016, ArXiv.

[36]  Steven K. Feiner,et al.  A touring machine: Prototyping 3D mobile augmented reality systems for exploring the urban environment , 1997, Digest of Papers. First International Symposium on Wearable Computers.

[37]  Andrew J. Viterbi Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm , 1967, IEEE Transactions on Information Theory.

[38]  Nicholas W. D. Evans,et al.  Speaker Diarization: A Review of Recent Research , 2010, IEEE Transactions on Audio, Speech, and Language Processing.

[39]  Dieter Schmalstieg,et al.  “Studierstube”: An environment for collaboration in augmented reality , 1998, Virtual Reality.

[40]  J. Lasak,et al.  Hearing loss: diagnosis and management. , 2014, Primary care.

[41]  Meng Wang,et al.  Dynamic captioning: video accessibility enhancement for hearing impairment , 2010, ACM Multimedia.

[42]  Mikko Kurimo,et al.  Studies on Training Text Selection for Conversational Finnish Language Modeling , 2013 .

[43]  Josef Psutka,et al.  Towards live subtitling of TV ice-hockey commentary , 2013, 2013 International Conference on Signal Processing and Multimedia Applications (SIGMAP).

[44]  Dong Yu,et al.  Automatic Speech Recognition: A Deep Learning Approach , 2014 .

[45]  Hannes Kaufmann,et al.  High-quality reflections, refractions, and caustics in Augmented Reality and their contribution to visual coherence , 2012, 2012 IEEE International Symposium on Mixed and Augmented Reality (ISMAR).

[46]  Mark J. F. Gales,et al.  The Application of Hidden Markov Models in Speech Recognition , 2007, Found. Trends Signal Process.

[47]  Alexei A. Goon,et al.  A Survey of Tracking Technology for Virtual Environments , 1999 .

[48]  Heedong Ko,et al.  "Move the couch where?" : developing an augmented reality multimodal interface , 2006, 2006 IEEE/ACM International Symposium on Mixed and Augmented Reality.

[49]  A. R. Møller Hearing: Anatomy, Physiology, and Disorders of the Auditory System , 2013 .

[50]  Walter S. Lasecki,et al.  Captions versus transcripts for online video content , 2013, W4A.

[51]  Isabelle Guyon,et al.  An Introduction to Feature Extraction , 2006, Feature Extraction.

[52]  Raja S. Kushalnagar,et al.  Multiple view perspectives: improving inclusiveness and video compression in mainstream classroom recordings , 2010, ASSETS '10.

[53]  Tara Matthews,et al.  Scribe4Me: Evaluating a Mobile Sound Transcription Tool for the Deaf , 2006, UbiComp.

[54]  Juri Lukkarila Developing a Conversation Assistant for the Hearing Impaired Using Automatic Speech Recognition , 2017 .

[55]  A. Baddeley The episodic buffer: a new component of working memory? , 2000, Trends in Cognitive Sciences.

[56]  Luigi Ferrucci,et al.  Hearing loss prevalence in the United States. , 2011, Archives of internal medicine.

[57]  M Sorri,et al.  Do we know the real need for hearing rehabilitation at the population level? Hearing impairments in the 5- to 75-year-old cross-sectional Finnish population. , 1999, British journal of audiology.

[58]  Yajie Miao,et al.  EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).

[59]  B. Moore Cochlear hearing loss : physiological, psychological and technical issues , 2014 .

[60]  Navdeep Jaitly,et al.  Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.

[61]  Seyed Ghorshi,et al.  Combining Augmented Reality and Speech Technologies to Help Deaf and Hard of Hearing People , 2012, 2012 14th Symposium on Virtual and Augmented Reality.

[62]  J. Paul Robinson,et al.  Using speech recognition for real-time captioning and lecture transcription in the classroom , 2013, IEEE Transactions on Learning Technologies.

[63]  Ernesto Damiani,et al.  Augmented reality technologies, systems and applications , 2010, Multimedia Tools and Applications.

[64]  Keiichi Zempo,et al.  Caption support system for complementary dialogical information using see-through head mounted display , 2015, 2015 IEEE 4th Global Conference on Consumer Electronics (GCCE).

[65]  P. Milgram,et al.  A Taxonomy of Mixed Reality Visual Displays , 1994 .

[66]  Qian-Jie Fu,et al.  Noise Susceptibility of Cochlear Implant Users: The Role of Spectral Resolution and Smearing , 2005, Journal of the Association for Research in Otolaryngology.

[67]  Mary R. Power,et al.  Everyone here speaks TXT: deaf people using SMS in Australia and the rest of the world. , 2004, Journal of deaf studies and deaf education.

[68]  Ian P. Howard,et al.  Binocular Vision and Stereopsis , 1996 .

[69]  Luigi Ferrucci,et al.  Hearing loss and cognitive decline in older adults: questions and answers , 2014, Aging Clinical and Experimental Research.

[70]  Hong Chen,et al.  Observing a volume rendered fetus within a pregnant patient , 1994, Proceedings Visualization '94.

[71]  D. W. F. van Krevelen,et al.  A Survey of Augmented Reality Technologies, Applications and Limitations , 2010, Int. J. Virtual Real.

[72]  Douglas D. O'Shaughnessy,et al.  Generalized mel frequency cepstral coefficients for large-vocabulary speaker-independent continuous-speech recognition , 1999, IEEE Trans. Speech Audio Process.

[73]  Kangsoo Kim,et al.  Revisiting Trends in Augmented Reality Research: A Review of the 2nd Decade of ISMAR (2008–2017) , 2018, IEEE Transactions on Visualization and Computer Graphics.

[74]  Pan Hui,et al.  Mobile Augmented Reality Survey: From Where We Are to Where We Go , 2017, IEEE Access.

[75]  Tanel Alumäe,et al.  Full-duplex Speech-to-text System for Estonian , 2014, Baltic HLT.

[76]  Alex Acero,et al.  Spoken Language Processing: A Guide to Theory, Algorithm and System Development , 2001 .

[77]  Morton Leonard Heilig,et al.  El Cine del Futuro: The Cinema of the Future , 1992, Presence: Teleoperators & Virtual Environments.

[78]  Poorna Kushalnagar,et al.  Collaborative Gaze Cues and Replay for Deaf and Hard of Hearing Students , 2014, ICCHP.

[79]  Katashi Nagao,et al.  The world through the computer: computer augmented interaction with real world environments , 1995, UIST '95.

[80]  Antti Ajanki,et al.  An augmented reality interface to contextual information , 2011, Virtual Reality.

[81]  Ramesh Raskar,et al.  Modern approaches to augmented reality , 2005, SIGGRAPH Courses.

[82]  Geoffrey E. Hinton,et al.  Phoneme recognition using time-delay neural networks , 1989, IEEE Trans. Acoust. Speech Signal Process.

[83]  Soh-Khim Ong,et al.  Virtual and Augmented Reality Applications in Manufacturing , 2004, MIM.

[84]  A. Needleman,et al.  Speech recognition in noise by hearing-impaired and noise-masked normal-hearing listeners. , 1995, Journal of the American Academy of Audiology.

[85]  M. Järvelin,et al.  Effect of hearing impairment on educational outcomes and employment up to the age of 25 years in northern Finland. , 1997, British journal of audiology.

[86]  Eric Rescorla,et al.  The Transport Layer Security (TLS) Protocol Version 1.2 , 2008, RFC.

[87]  J. Mills,et al.  Presbycusis , 2005, The Lancet.

[88]  Bruce H. Thomas,et al.  A wearable computer system with augmented reality to support terrestrial navigation , 1998, Digest of Papers. Second International Symposium on Wearable Computers (Cat. No.98EX215).

[89]  E. Platz,et al.  Prevalence of hearing loss and differences by demographic characteristics among US adults: data from the National Health and Nutrition Examination Survey, 1999-2004. , 2008, Archives of internal medicine.

[90]  Alexey Melnikov,et al.  The WebSocket Protocol , 2011, RFC.

[91]  T. P. Caudell,et al.  Augmented reality: an application of heads-up display technology to manual manufacturing processes , 1992, Proceedings of the Twenty-Fifth Hawaii International Conference on System Sciences.

[92]  J. Pickles An Introduction to the Physiology of Hearing , 1982 .

[93]  J. R. Holt,et al.  Generation of inner ear organoids with functional hair cells from human pluripotent stem cells , 2017, Nature Biotechnology.

[94]  Mikko Kurimo,et al.  Automatic Speech Recognition With Very Large Conversational Finnish and Estonian Vocabularies , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[95]  Tara N. Sainath,et al.  Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.

[96]  Mikko Kurimo,et al.  Automatic Construction of the Finnish Parliament Speech Corpus , 2017, INTERSPEECH.

[97]  Ian McGraw,et al.  Personalized speech recognition on mobile devices , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[98]  Ronald Azuma,et al.  A Survey of Augmented Reality , 1997, Presence: Teleoperators & Virtual Environments.

[99]  Heather Fortnum,et al.  Why do people fitted with hearing aids not wear them? , 2013, International journal of audiology.

[100]  Stig Arlinger,et al.  Negative consequences of uncorrected hearing loss—a review , 2003, International journal of audiology.

[101]  Ivan E. Sutherland,et al.  The Ultimate Display , 1965 .

[102]  Nelson Cowan,et al.  The many faces of working memory and short-term storage , 2017, Psychonomic bulletin & review.

[103]  Mel Slater,et al.  A note on presence terminology , 2003 .

[104]  Q. Huang,et al.  Age-related hearing loss or presbycusis , 2010, European Archives of Oto-Rhino-Laryngology.

[105]  Douglas A. Reynolds,et al.  Measuring the readability of automatic speech-to-text transcripts , 2003, INTERSPEECH.

[106]  Ronald Klein,et al.  The impact of hearing loss on quality of life in older adults. , 2003, The Gerontologist.

[107]  David W. Murray,et al.  Simulating Low-Cost Cameras for Augmented Reality Compositing , 2010, IEEE Transactions on Visualization and Computer Graphics.

[108]  Dennis A. Vincenzi,et al.  The Effect of Apparent Latency on Simulator Sickness While Using a See-Through Helmet-Mounted Display , 2012, Hum. Factors.

[109]  Steven K. Feiner,et al.  Perceptual issues in augmented reality revisited , 2010, 2010 IEEE International Symposium on Mixed and Augmented Reality.

[110]  Jeff B Pelz,et al.  Classroom Interpreting and Visual Information Processing in Mainstream Education for Deaf Students: Live or Memorex®? , 2005, American educational research journal.

[111]  Hanseok Ko,et al.  Dialogue enabling speech-to-text user assistive agent with auditory perceptual beamforming for hearing-impaired , 2013, 2013 IEEE International Conference on Consumer Electronics (ICCE).

[112]  Hirokazu Kato,et al.  Marker tracking and HMD calibration for a video-based augmented reality conferencing system , 1999, Proceedings 2nd IEEE and ACM International Workshop on Augmented Reality (IWAR'99).

[113]  Ronald Azuma,et al.  Recent Advances in Augmented Reality , 2001, IEEE Computer Graphics and Applications.

[114]  M. D’Esposito Working memory. , 2008, Handbook of clinical neurology.

[115]  J. Rolland,et al.  Head-worn displays: a review , 2006, Journal of Display Technology.