Paper information - AP16-OL7: A multilingual database for oriental languages and a language recognition baseline

We present the AP16-OL7 database, which was released as the training and test data for the oriental language recognition (OLR) challenge at APSIPA 2016. Using this database, we built a baseline system based on the i-vector model. We report the baseline results under the metrics defined in the AP16-OLR evaluation plan and demonstrate that AP16-OL7 is a reasonable data resource for multilingual research.
