SCODED: Statistical Constraint Oriented Data Error Detection
Reynold Cheng | Oliver Schulte | Jiannan Wang | Jing Nathan Yan | MoHan Zhang
[1] Jeff G. Schneider,et al. Anomaly pattern detection in categorical datasets , 2008, KDD.
[2] Sanjay Krishnan,et al. ActiveClean: An Interactive Data Cleaning Framework For Modern Machine Learning , 2016, SIGMOD Conference.
[3] Brian Macdonald. A Regression-Based Adjusted Plus-Minus Statistic for NHL Players , 2010, 1006.4310.
[4] Sam Madden,et al. Outlier Detection in Heterogeneous Datasets using Automatic Tuple Expansion , 2016 .
[5] Tim Kraska,et al. SampleClean: Fast and Reliable Analytics on Dirty Data , 2015, IEEE Data Eng. Bull..
[6] David J. Spiegelhalter,et al. Probabilistic Networks and Expert Systems - Exact Computational Methods for Bayesian Networks , 1999, Information Science and Statistics.
[7] Sunil Prabhakar,et al. ERACER: a database approach for statistical inference and data cleaning , 2010, SIGMOD Conference.
[8] Paolo Papotti,et al. KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing , 2015, SIGMOD Conference.
[9] Jeff G. Schneider,et al. Detecting anomalous records in categorical datasets , 2007, KDD '07.
[10] Yan Liu,et al. Medical data mining: insights from winning two competitions , 2010, Data Mining and Knowledge Discovery.
[11] Theodoros Rekatsinas,et al. HoloDetect: Few-Shot Learning for Error Detection , 2019, SIGMOD Conference.
[12] R. Nelsen,et al. On the relationship between Spearman's rho and Kendall's tau for pairs of continuous random variables , 2007 .
[13] Jilles Vreeken,et al. Discovering Reliable Approximate Functional Dependencies , 2017, KDD.
[14] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[15] Dan Suciu,et al. Towards correcting input data errors probabilistically using integrity constraints , 2006, MobiDE '06.
[16] Michael Stonebraker,et al. Raha: A Configuration-Free Error Detection System , 2019, SIGMOD Conference.
[17] Joseph M. Hellerstein,et al. Quantitative Data Cleaning for Large Databases , 2008 .
[18] Dan Suciu,et al. Bias in OLAP Queries: Detection, Explanation, and Removal , 2018, SIGMOD Conference.
[19] A. Dawid. Conditional Independence in Statistical Theory , 1979 .
[20] Ihab F. Ilyas,et al. Data Cleaning: Overview and Emerging Challenges , 2016, SIGMOD Conference.
[21] Paul Brown,et al. CORDS: automatic discovery of correlations and soft functional dependencies , 2004, SIGMOD '04.
[22] Dan Geiger,et al. d-Separation: From Theorems to Algorithms , 2013, UAI.
[23] Dan Suciu,et al. Capuchin: Causal Database Repair for Algorithmic Fairness , 2019, ArXiv.
[24] Yeye He,et al. Auto-Detect: Data-Driven Error Detection in Tables , 2018, SIGMOD Conference.
[25] Marc Gyssens,et al. On the conditional independence implication problem: A lattice-theoretic approach , 2008, Artif. Intell..
[26] Dan Suciu,et al. HypDB: A Demonstration of Detecting, Explaining and Resolving Bias in OLAP queries , 2018, Proc. VLDB Endow..
[27] Paolo Papotti,et al. Holistic data cleaning: Putting violations into context , 2013, 2013 IEEE 29th International Conference on Data Engineering (ICDE).
[28] J. Pearl. Causality: Models, Reasoning and Inference , 2000 .
[29] Chao Li,et al. Model Trees for Identifying Exceptional Players in the NHL and NBA Drafts , 2018, MLSA@PKDD/ECML.
[30] Paolo Papotti,et al. Discovering Denial Constraints , 2013, Proc. VLDB Endow..
[31] Catherine Dehon,et al. Influence functions of the Spearman and Kendall correlation measures , 2010, Stat. Methods Appl..
[32] Robi Polikar,et al. Incremental Learning of Concept Drift in Nonstationary Environments , 2011, IEEE Transactions on Neural Networks.
[33] Dan Suciu,et al. Integrity Constraints Revisited: From Exact to Approximate Implication , 2018, ICDT.
[34] Ronald Fagin,et al. Multivalued dependencies and a new normal form for relational databases , 1977, TODS.
[35] Felix Naumann,et al. DynFD: Functional Dependency Discovery in Dynamic Datasets , 2019, EDBT.
[36] Larry Wasserman,et al. All of Statistics: A Concise Course in Statistical Inference , 2004 .
[37] Oliver Schulte,et al. Model-Based Outlier Detection for Object-Relational Data , 2015, 2015 IEEE Symposium Series on Computational Intelligence.
[38] Milan Studeny,et al. Conditional independence relations have no finite complete characterization , 1992 .
[39] J. Pearl,et al. Logical and Algorithmic Properties of Conditional Independence and Graphical Models , 1993 .
[40] D. Margaritis. Learning Bayesian Network Model Structure from Data , 2003 .
[41] Ahmed K. Elmagarmid,et al. Don't be SCAREd: use SCalable Automatic REpairing with maximal likelihood and bounded changes , 2013, SIGMOD '13.
[42] Luiz Eduardo Soares de Oliveira,et al. Adapting dynamic classifier selection for concept drift , 2018, Expert Syst. Appl..
[43] Felix Naumann,et al. Data Profiling , 2018, Data Profiling.
[44] Dan Suciu,et al. A formal approach to finding explanations for database queries , 2014, SIGMOD Conference.
[45] Eugene Wu,et al. QFix: Diagnosing Errors through Query Histories , 2016, SIGMOD Conference.
[46] D. Rubinfeld,et al. Hedonic housing prices and the demand for clean air , 1978 .
[47] Laks V. S. Lakshmanan,et al. On approximating optimum repairs for functional dependency violations , 2009, ICDT '09.
[48] D. C. Howell. Statistical Methods for Psychology , 1987 .
[49] Rajeev Rastogi,et al. A cost-based model and effective heuristic for repairing constraints by value modification , 2005, SIGMOD '05.
[50] S. Sullivant. Gaussian conditional independence relations have no finite complete characterization , 2007, 0704.2847.
[51] Yeye He,et al. Uni-Detect: A Unified Approach to Automated Error Detection in Tables , 2019, SIGMOD Conference.
[52] Tova Milo,et al. Query-Oriented Data Cleaning with Oracles , 2015, SIGMOD Conference.
[53] Michael Stonebraker,et al. Detecting Data Errors: Where are we and what needs to be done? , 2016, Proc. VLDB Endow..
[54] Yeung Sam Hung,et al. A comparative analysis of Spearman's rho and Kendall's tau in normal and contaminated normal models , 2013, Signal Process..
[55] David Maxwell Chickering,et al. Finding Optimal Bayesian Networks , 2002, UAI.
[56] Felix Naumann,et al. Detecting unique column combinations on dynamic data , 2014, 2014 IEEE 30th International Conference on Data Engineering.
[57] W. Knight. A Computer Method for Calculating Kendall's Tau with Ungrouped Data , 1966 .
[58] Gustavo Alonso,et al. Declarative Support for Sensor Data Cleaning , 2006, Pervasive.
[59] Ihab F. Ilyas,et al. Trends in Cleaning Relational Data: Consistency and Deduplication , 2015, Found. Trends Databases.
[60] Christopher Ré,et al. The HoloClean Framework , 2017 .
[61] Dan Wu,et al. On the implication problem for probabilistic conditional independency , 2000, IEEE Trans. Syst. Man Cybern. Part A.
[62] Samuel Madden,et al. Scorpion: Explaining Away Outliers in Aggregate Queries , 2013, Proc. VLDB Endow..
[63] Dan Suciu,et al. Interventional Fairness: Causal Database Repair for Algorithmic Fairness , 2019, SIGMOD Conference.
[64] Wei Hong,et al. TinyDB: an acquisitional query processing system for sensor networks , 2005, TODS.
[65] Alexandra Meliou,et al. Data X-Ray: A Diagnostic Tool for Data Errors , 2015, SIGMOD Conference.