Overview and Importance of Data Quality for Machine Learning Tasks
Sameep Mehta | Shazia Afzal | Hima Patel | Shashank Mujumdar | Nitin Gupta | Vitobha Munigala | Ruhi Sharma Mittal | Abhinav Jain | Lokesh Nagalapatti | Shanmukha C. Guttula
[1] Isaac L. Chuang, et al. Confident Learning: Estimating Uncertainty in Dataset Labels, 2019, J. Artif. Intell. Res.
[2] James Y. Zou, et al. Data Shapley: Equitable Valuation of Data for Machine Learning, 2019, ICML.
[3] M. de Rijke, et al. Improving Neural Response Diversity with Frequency-Aware Cross-Entropy Loss, 2019, WWW.
[4] Neoklis Polyzotis, et al. Data Validation for Machine Learning, 2019, SysML.
[5] Sumit Gulwani, et al. Automating string processing in spreadsheets using input-output examples, 2011, POPL '11.
[6] Lingjia Tang, et al. Outlier Detection for Improved Data Quality and Diversity in Dialog Systems, 2019, NAACL.
[7] Nikolai Rozanov, et al. Evolutionary Data Measures: Understanding the Difficulty of Text Classification Tasks, 2018, CoNLL.
[8] Charu C. Aggarwal, et al. Outlier Detection for Text Data, 2017, SDM.
[9] Michael J. Muller, et al. How Data Science Workers Work with Data: Discovery, Capture, Curation, Design, Creation, 2019, CHI.
[10] Pushmeet Kohli, et al. RobustFill: Neural Program Learning under Noisy I/O, 2017, ICML.
[11] Beng Chin Ooi, et al. Rapid Identification of Column Heterogeneity, 2006, Sixth International Conference on Data Mining (ICDM'06).
[12] Misha Denil, et al. Overlap versus Imbalance, 2010, Canadian Conference on AI.
[13] Sumit Gulwani. Automating string processing in spreadsheets using input-output examples, 2011.
[14] Cornelia Kiefer. Quality Indicators for Text Data, 2019, BTW.
[15] Sercan O. Arik, et al. Data Valuation using Reinforcement Learning, 2019, ICML.
[16] Helena Galhardas, et al. A Taxonomy of Data Quality Problems, 2005.
[17] Erhard Rahm, et al. Data Cleaning: Problems and Current Approaches, 2000, IEEE Data Eng. Bull.
[18] Maria Liakata, et al. Aiming beyond the Obvious: Identifying Non-Obvious Cases in Semantic Similarity Datasets, 2019, ACL.
[19] Laure Berti-Équille, et al. Learn2Clean: Optimizing the Sequence of Tasks for Web Data Preparation, 2019, WWW.