Automatic Multilingual System from Speech

Language recognition is the process by which a computer automatically identifies the language of a digital speech utterance. Existing language identification systems sequentially transform the speech signal into discrete units and then apply statistical methods to those units to extract language information. A large number of audio retrieval features is now available for automatic speech and language recognition. This work proposes an automatic system for identifying several well-known languages using a new set of audio features. The adopted feature set comprises Zero-Crossing Rate, Spectral Flux, Pitch, Mel-frequency Cepstral Coefficients, Tempo, and Short-Time Energy. These features alone are used to identify the language, with the help of classifiers and feature selection algorithms.
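As a rough illustration of the kind of frame-level features listed above, the following sketch computes Zero-Crossing Rate, Short-Time Energy, and Spectral Flux with NumPy and summarizes each by its mean and standard deviation over the utterance. It is not the system described here: the function names, the 25 ms frame and 10 ms hop at an assumed 16 kHz sampling rate, and the mean/std summary are all assumptions made for the example, and Pitch, MFCCs, Tempo, feature selection, and the classifier are omitted.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose signs differ, per frame."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def short_time_energy(frames):
    """Mean squared amplitude, per frame."""
    return np.mean(frames ** 2, axis=1)

def spectral_flux(frames):
    """Euclidean distance between magnitude spectra of consecutive frames."""
    window = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    return np.sqrt(np.sum(np.diff(mag, axis=0) ** 2, axis=1))

def utterance_features(x, frame_len=400, hop=160):
    """Hypothetical utterance-level vector: mean and std of each feature."""
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    per_frame = [
        zero_crossing_rate(frames),
        short_time_energy(frames),
        spectral_flux(frames),
    ]
    return np.array([stat for f in per_frame for stat in (f.mean(), f.std())])

# Synthetic stand-in for a speech utterance: one second of a 440 Hz tone.
sr = 16000  # assumed sampling rate in Hz
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
features = utterance_features(tone)
```

A vector such as `features` would then be passed, together with a language label, to a feature selection step and a classifier of the reader's choice.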
