Paper information: Text based user comments as a signal for automatic language identification of online videos

Identifying the audio language of online videos is crucial for industrial multimedia applications. Automatic speech recognition systems can potentially detect the language of the audio. However, such systems are not available for all languages. Moreover, background noise, music, and multi-party conversations make audio language identification hard. Instead, we utilize text-based user comments as a new signal to identify the audio language of YouTube videos. First, we detect the language of the text-based comments. Augmenting this information with video metadata features, we predict the language of the videos with an accuracy of 97% on a set of publicly available videos. The subject matter discussed in this research is patent pending.
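The two-step idea in the abstract (detect the language of each comment, then combine that with metadata) can be illustrated with a minimal sketch. This is not the authors' classifier: the abstract does not specify the model or the features, so the weighted vote below, the function names, and the `metadata_weight` parameter are all hypothetical. The per-comment language labels are assumed to come from some existing text language detector.

```python
from collections import Counter


def comment_language_distribution(comment_langs):
    """Return the fraction of comments detected in each language.

    `comment_langs` is a list of language codes, one per comment,
    assumed to be produced by an external text language detector.
    """
    counts = Counter(comment_langs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lang: n / total for lang, n in counts.items()}


def predict_video_language(comment_langs, metadata_lang=None, metadata_weight=0.25):
    """Pick a video language from comment languages plus one metadata hint.

    Hypothetical baseline: the comment language distribution is used as
    the score, and a language suggested by the video metadata (if any)
    gets a fixed bonus. Ties are broken alphabetically.
    """
    scores = dict(comment_language_distribution(comment_langs))
    if metadata_lang is not None:
        scores[metadata_lang] = scores.get(metadata_lang, 0.0) + metadata_weight
    if not scores:
        return None  # no comments and no metadata: nothing to predict from
    return max(sorted(scores), key=scores.get)


if __name__ == "__main__":
    langs = ["tr", "tr", "en", "tr"]
    print(comment_language_distribution(langs))
    print(predict_video_language(langs))
    print(predict_video_language(["en", "tr"], metadata_lang="tr"))
```

In a trained system, the distribution returned by `comment_language_distribution` would more plausibly serve as a feature vector for a supervised classifier than as a direct vote; the fixed bonus here only stands in for that step.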

Sertan Girgin | Reshu Jain | Natalia Ponomareva | A. Seza Dogruöz | Christoph Oehler
