Prevent Low-Quality Analytics by Automatic Selection of the Best-Fitting Training Data
[1] Iryna Gurevych, et al. DKPro Similarity: An Open Source Framework for Text Similarity, 2013, ACL.
[2] Thorsten Brants, et al. TnT – A Statistical Part-of-Speech Tagger, 2000, ANLP.
[3] Albert Y. Kim, et al. Hypothesis Testing, 2019, Encyclopedic Dictionary of Archaeology.
[4] Diane M. Strong, et al. Beyond Accuracy: What Data Quality Means to Data Consumers, 1996, J. Manag. Inf. Syst.
[5] Laura Sebastian-Coleman, et al. Measuring Data Quality for Ongoing Improvement: A Data Quality Assessment Framework, 2012.
[6] Peng Bi, et al. Handbook of Linguistic Annotation, 2018, J. Quant. Linguistics.
[7] Cornelia Kiefer, et al. Assessing the Quality of Unstructured Data: An Initial Overview, 2016, LWDA.
[8] William D. Lewis, et al. Intelligent Selection of Language Model Training Data, 2010, ACL.
[9] El Habib Benlahmar, et al. Survey of Plagiarism Detection Approaches and Big data Techniques related to Plagiarism Candidate Retrieval, 2017, BDCA.
[10] Patrick F. Reidy. An Introduction to Latent Semantic Analysis, 2009.
[11] Christopher D. Manning, et al. Introduction to Information Retrieval, 2010, J. Assoc. Inf. Sci. Technol.
[12] Jeff Mielke. A phonetically based metric of sound similarity, 2012.
[13] Walt Detmar Meurers, et al. Short Answer Assessment: Establishing Links Between Research Strands, 2012, BEA@NAACL-HLT.
[14] José Francisco Martínez Trinidad, et al. A review of instance selection methods, 2010, Artificial Intelligence Review.
[15] Luis González Abril, et al. A similarity measure between videos using alignment, graphical and speech features, 2012, Expert Syst. Appl.
[16] Pascal Hirmer, et al. FlexMash 2.0 - Flexible Modeling and Execution of Data Mashups, 2016, RMC.
[17] Vladimir I. Levenshtein, et al. Binary codes capable of correcting deletions, insertions, and reversals, 1965.
[18] Avrim Blum, et al. The Bottleneck, 2021, Monopsony Capitalism.
[19] Teh Ying Wah, et al. A Comparison Study on Similarity and Dissimilarity Measures in Clustering Continuous Data, 2015, PloS one.
[20] E. Valuations. A Review on Evaluation Metrics for Data Classification Evaluations, 2015.
[21] Daniel Sonntag, et al. Assessing the Quality of Natural Language Text Data, 2004, GI Jahrestagung.
[22] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.
[23] Jianfeng Gao, et al. Domain Adaptation via Pseudo In-Domain Data Selection, 2011, EMNLP.
[24] Leon N. Cooper, et al. Training Data Selection for Support Vector Machines, 2005, ICNC.
[25] J. I. The Design of Experiments, 1936, Nature.
[26] Brendan T. O'Connor, et al. Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments, 2010, ACL.
[27] Qiang Yang, et al. A Survey on Transfer Learning, 2010, IEEE Transactions on Knowledge and Data Engineering.
[28] Christopher D. Manning. Part-of-Speech Tagging from 97% to 100%: Is It Time for Some Linguistics?, 2011, CICLing.
[29] Jieping Ye, et al. Learning Adversarial Networks for Semi-Supervised Text Classification via Policy Gradient, 2018, KDD.