Manually Detecting Errors for Data Cleaning Using Adaptive Crowdsourcing Strategies
AnHai Doan | Esteban Arcaute | Paris Koutris | Chengliang Chai | Haojun Zhang
[1] Rob Miller, et al. Crowdsourced Databases: Query Processing with People, 2011, CIDR.
[2] Yannis Papakonstantinou, et al. Waldo: An Adaptive Human Interface for Crowd Entity Resolution, 2017, SIGMOD Conference.
[3] Hector Garcia-Molina, et al. CrowdDQS: Dynamic Question Selection in Crowdsourcing Systems, 2017, SIGMOD Conference.
[4] Michael Stonebraker, et al. Detecting Data Errors: Where are we and what needs to be done?, 2016, Proc. VLDB Endow.
[5] Panagiotis G. Ipeirotis, et al. Quality management on Amazon Mechanical Turk, 2010, HCOMP '10.
[6] J. Leeuw, et al. Isotone Optimization in R: Pool-Adjacent-Violators Algorithm (PAVA) and Active Set Methods, 2009.
[7] Ashwin Machanavajjhala, et al. An automatic blocking mechanism for large-scale de-duplication tasks, 2012, CIKM '12.
[8] Shipeng Yu, et al. Eliminating Spammers and Ranking Annotators for Crowdsourced Labeling Tasks, 2012, J. Mach. Learn. Res.
[9] Sanjay Krishnan, et al. ActiveClean: Interactive Data Cleaning For Statistical Modeling, 2016, Proc. VLDB Endow.
[10] Lei Chen, et al. CrowdCleaner: Data cleaning for multi-version data on the web via crowdsourcing, 2014, 2014 IEEE 30th International Conference on Data Engineering.
[11] Bo Zhao, et al. The wisdom of minority: discovering and targeting the right group of workers for crowdsourcing, 2014, WWW.
[12] Tim Kraska, et al. CrowdER: Crowdsourcing Entity Resolution, 2012, Proc. VLDB Endow.
[13] Surajit Chaudhuri, et al. Data Debugger: An Operator-Centric Approach for Data Quality Solutions, 2006, IEEE Data Eng. Bull.
[14] George Papastefanatos, et al. Parallel meta-blocking: Realizing scalable entity resolution over large, heterogeneous data, 2015, 2015 IEEE International Conference on Big Data (Big Data).
[15] Ihab F. Ilyas, et al. Data Cleaning: Overview and Emerging Challenges, 2016, SIGMOD Conference.
[16] Jennifer Widom, et al. Query Optimization over Crowdsourced Data, 2013, Proc. VLDB Endow.
[17] Erhard Rahm, et al. Data Cleaning: Problems and Current Approaches, 2000, IEEE Data Eng. Bull.
[18] Paolo Papotti, et al. BigDansing: A System for Big Data Cleansing, 2015, SIGMOD Conference.
[19] Eugene Wu, et al. CLAMShell: Speeding up Crowds for Low-latency Data Labeling, 2015, Proc. VLDB Endow.
[20] Beng Chin Ooi, et al. CDAS: A Crowdsourcing Data Analytics System, 2012, Proc. VLDB Endow.
[21] Reynold Cheng, et al. Optimizing Task Assignment for Crowdsourcing Environments, 2013.
[22] Theodore Johnson, et al. Exploratory Data Mining and Data Cleaning, 2003.
[23] Andreas Thor, et al. Parallel Sorted Neighborhood Blocking with MapReduce, 2011, BTW.
[24] Paolo Papotti, et al. KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing, 2015, SIGMOD Conference.
[25] Sanjay Krishnan, et al. Wisteria: Nurturing Scalable Data Cleaning Infrastructure, 2015, Proc. VLDB Endow.
[26] Gerardo Hermosillo, et al. Supervised learning from multiple experts: whom to trust when everyone lies a bit, 2009, ICML '09.
[27] Javier R. Movellan, et al. Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise, 2009, NIPS.
[28] Thorsten Joachims, et al. Optimizing search engines using clickthrough data, 2002, KDD.
[29] Jeffrey Heer, et al. Predictive Interaction for Data Transformation, 2015, CIDR.
[30] Kai Zhao, et al. Exploring What not to Clean in Urban Data: A Study Using New York City Taxi Trips, 2016, IEEE Data Eng. Bull.
[31] Jeffrey F. Naughton, et al. Corleone: hands-off crowdsourcing for entity matching, 2014, SIGMOD Conference.
[32] Tim Kraska, et al. Leveraging transitive relations for crowdsourced joins, 2013, SIGMOD '13.
[33] Pierre Senellart, et al. CrowdMiner: Mining association rules from the crowd, 2013, Proc. VLDB Endow.
[34] AnHai Doan, et al. Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services, 2017, SIGMOD Conference.
[35] Purnamrita Sarkar, et al. Scaling Up Crowd-Sourcing to Very Large Datasets: A Case for Active Learning, 2014, Proc. VLDB Endow.
[36] Ihab F. Ilyas, et al. Distributed Data Deduplication, 2016, Proc. VLDB Endow.
[37] Lakshminarayanan Subramanian, et al. Reputation-based Worker Filtering in Crowdsourcing, 2014, NIPS.
[38] Hisashi Kashima, et al. Accurate Integration of Crowdsourced Labels Using Workers' Self-reported Confidence Scores, 2013, IJCAI.
[39] Qili Deng, et al. Deep learning for gender recognition, 2015, 2015 International Conference on Computers, Communications, and Systems (ICCCS).
[40] Surajit Chaudhuri, et al. Towards a Domain Independent Platform for Data Cleaning, 2011, IEEE Data Eng. Bull.
[41] Tim Kraska, et al. CrowdDB: answering queries with crowdsourcing, 2011, SIGMOD '11.
[42] Divesh Srivastava, et al. Global detection of complex copying relationships between sources, 2010, Proc. VLDB Endow.
[43] C. Spearman. The proof and measurement of association between two things, 2015, International journal of epidemiology.
[44] Dennis Shasha, et al. Declarative Data Cleaning: Language, Model, and Algorithms, 2001, VLDB.
[45] Michael S. Bernstein, et al. Measuring Crowdsourcing Effort with Error-Time Curves, 2015, CHI.
[46] Jennifer Widom, et al. CrowdScreen: algorithms for filtering data with humans, 2012, SIGMOD Conference.
[47] Aditya G. Parameswaran, et al. Answering Queries using Humans, Algorithms and Databases, 2011, CIDR.
[48] Ihab F. Ilyas, et al. Qualitative Data Cleaning, 2016, Proc. VLDB Endow.