Paper Information - Query-based Why-not Explanations for Nested Data

Query-based Why-not Explanations for Nested Data

We present the first query-based approach for explaining missing answers to queries over nested relational data, a common data format used by big data systems such as Apache Spark. Our main contributions are a novel definition of query-based why-not provenance based on repairs to queries, and an implementation with preliminary experiments for answering such queries in Spark.

Boris Glavic | Melanie Herschel | Ralf Diestelkämper | Seokki Lee

[1] Chen Wang, et al. Extended XML Tree Pattern Matching: Theories and Algorithms, 2011, IEEE Transactions on Knowledge and Data Engineering.

[2] Daniel Deutch, et al. Putting Lipstick on Pig: Enabling Database-style Workflow Provenance, 2011, Proc. VLDB Endow.

[3] Melanie Herschel. A Hybrid Approach to Answering Why-Not Questions on Relational Query Results, 2015, JDIQ.

[4] Miryung Kim, et al. Adding data provenance support to Apache Spark, 2017, The VLDB Journal.

[5] Val Tannen, et al. Annotated XML: queries and provenance, 2008, PODS.

[6] Melanie Herschel, et al. Query-Based Why-Not Provenance with NedExplain, 2014, EDBT.

[7] Nick Koudas, et al. Interactive query refinement, 2009, EDBT '09.

[8] Khalid Belhajjame. On Answering Why-Not Queries Against Scientific Workflow Provenance, 2018, EDBT.

[9] Adriane Chapman, et al. Why Not?, 2009, SIGMOD Conference.

[10] Melanie Herschel, et al. Efficient Computation of Polynomial Explanations of Why-Not Questions, 2015, CIKM.

[11] Quoc Trung Tran, et al. How to ConQueR why-not questions, 2010, SIGMOD Conference.

[12] Tova Milo, et al. Towards Tractable Algebras for Bags, 1996, J. Comput. Syst. Sci.

[13] Melanie Herschel, et al. A survey on provenance: What for? What form? What from?, 2017, The VLDB Journal.