Finger tracking: facilitating non-commercial content production for mobile e-reading applications

Limited literacy and visual impairment reduce many people's ability to read on their own. Current e-reader solutions rely on either unnatural synthetic voices or professionally produced audio e-books. Neither provides the same enjoyment as having a family member read to a user, especially when the user requires assistive reading (following printed text while listening to it being read). Unfortunately, support for non-commercial production of such e-books is limited, and producing them requires significant effort. We evaluate a novel, assistive mobile interaction technique that facilitates the recording of audio e-books and their synchronization with the read text. We show that a technique based on a finger tracking metaphor provides optimal support with respect to reading speed. These human-in-the-loop, adaptive techniques can now be used to reduce the content-creation burden associated with supporting those who cannot read on their own.
