Computationally efficient discrimination between language varieties with large feature vectors and regularized classifiers
[1] Cyril Goutte,et al. Discriminating Similar Languages: Evaluations and Explorations , 2016, LREC.
[2] Gaël Varoquaux,et al. Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res.
[3] Marcos Zampieri,et al. N-gram Language Models and POS Distribution for the Identification of Spanish Varieties (Ngrammes et Traits Morphosyntaxiques pour la Identification de Variétés de l’Espagnol) [in French] , 2013, JEP/TALN/RECITAL.
[4] Matthew Purver,et al. A Simple Baseline for Discriminating Similar Languages , 2014, VarDial@COLING.
[5] Evangelos Spiliotis,et al. Statistical and Machine Learning forecasting methods: Concerns and ways forward , 2018, PloS one.
[6] Preslav Nakov,et al. Overview of the DSL Shared Task 2015 , 2015.
[7] Adrien Barbaresi. Efficient construction of metadata-enhanced web corpora , 2016, WAC@ACL.
[8] Stephan Vogel,et al. Speech recognition challenge in the wild: Arabic MGB-3 , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[9] Jörg Tiedemann,et al. A Report on the DSL Shared Task 2014 , 2014, VarDial@COLING.
[10] Preslav Nakov,et al. Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign , 2018, VarDial@COLING.
[11] Arkaitz Zubiaga,et al. TweetLID: a benchmark for tweet language identification , 2016, Lang. Resour. Evaluation.
[12] Thomas Proisl,et al. SoMaJo: State-of-the-art tokenization for German web and social media texts , 2016, WAC@ACL.
[13] Mario Bertero,et al. The Stability of Inverse Problems , 1980.
[14] Timothy Baldwin,et al. Automatic Language Identification in Texts: A Survey , 2018, J. Artif. Intell. Res.
[15] Benno Stein,et al. Overview of the 5th Author Profiling Task at PAN 2017: Gender and Language Variety Identification in Twitter , 2017, CLEF.
[16] Preslav Nakov,et al. Findings of the VarDial Evaluation Campaign 2017 , 2017, VarDial.
[17] A. Barbaresi. Construction de corpus généraux et spécialisés à partir du Web (Ad hoc and general-purpose corpus construction from web sources) , 2015.
[18] Irina Rish. An empirical study of the naive Bayes classifier , 2001.
[19] Antal van den Bosch,et al. Exploring Lexical and Syntactic Features for Language Variety Identification , 2017, VarDial.
[20] Marco Lui,et al. Classifying English Documents by National Dialect , 2013, ALTA.
[21] Ritesh Kumar,et al. Automatic Identification of Closely-related Indian Languages: Resources and Experiments , 2018, ArXiv.
[22] Arthur E. Hoerl,et al. Application of ridge analysis to regression problems , 1962.
[23] Adrien Barbaresi. Discriminating between Similar Languages using Weighted Subword Features , 2017, VarDial.
[24] Yves Scherrer,et al. ArchiMob - A Corpus of Spoken Swiss German , 2016, LREC.
[25] David R. Karger,et al. Tackling the Poor Assumptions of Naive Bayes Text Classifiers , 2003, ICML.
[26] W. B. Cavnar,et al. N-gram-based text categorization , 1994.
[27] Timothy Baldwin,et al. Language Identification: The Long and the Short of the Matter , 2010, NAACL.
[28] Hans Peter Luhn,et al. A Statistical Approach to Mechanized Encoding and Searching of Literary Information , 1957, IBM J. Res. Dev.
[29] Jörg Tiedemann,et al. Merging Comparable Data Sources for the Discrimination of Similar Languages: The DSL Corpus Collection , 2014, LREC.
[30] Preslav Nakov,et al. Discriminating between Similar Languages and Arabic Dialect Identification: A Report on the Third DSL Shared Task , 2016, VarDial@COLING.
[31] Karen Spärck Jones. A statistical interpretation of term specificity and its application in retrieval , 1972, J. Documentation.
[32] Michael I. Jordan,et al. On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes , 2001, NIPS.
[33] Adrien Barbaresi,et al. An Unsupervised Morphological Criterion for Discriminating Similar Languages , 2016, VarDial@COLING.