A Lightweight and Efficient Mechanism for Fixing the Synchronization of Misaligned Subtitle Documents

Online subtitle databases let users easily find subtitle documents in multiple languages for thousands of films and TV series episodes. However, getting a subtitle document that yields satisfactory synchronization on the first attempt is like hitting the jackpot. In practice, the process often involves considerable trial and error, because multiple versions of a subtitle document carry distinct synchronization references, each targeting a different variation of the same audiovisual content. Building on our previous efforts to address this problem, in this paper we formalize and validate a two-phase subtitle synchronization framework. Its benefit over current approaches lies in the use of audio fingerprint annotations, generated from the base audio signal, as second-level synchronization anchors. These anchors allow the media player to dynamically fix, during playback, the most common cases of subtitle misalignment that compromise the viewing experience. Results from our evaluation indicate that the framework has minimal impact on existing subtitle documents and formats, as well as on playback performance.