AP16-OL7: A multilingual database for oriental languages and a language recognition baseline

We present the AP16-OL7 database, which was released as the training and test data for the oriental language recognition (OLR) challenge at APSIPA 2016. Using this database, we constructed a baseline system based on the i-vector model. We report the baseline results under the various metrics defined by the AP16-OLR evaluation plan and demonstrate that AP16-OL7 is a reasonable data resource for multilingual research.
