23rd International Conference on Database Theory, ICDT 2020, March 30-April 2, 2020, Copenhagen, Denmark

Probabilistic databases are commonly known in the form of the tuple-independent model, where the validity of every tuple is an independent random event. Conceptually, the notion is more general, as a probabilistic database refers to any probability distribution over ordinary databases. A central computational problem is that of marginal inference for database queries: what is the probability that a given tuple is a query answer? In this talk, I will discuss recent developments in several research directions that, collectively, position probabilistic databases as the common and natural foundation of various challenges at the core of data analytics. Examples include reasoning about uncertain preferences from conventional distributions such as the Mallows model, data cleaning and repairing in probabilistic paradigms such as the HoloClean system, and the explanation of query answers through concepts from cooperative game theory such as the Shapley value and the Banzhaf Power Index. While these challenges manifest different facets of probabilistic databases, I will show how they interrelate and, moreover, how they relate to the basic theory of inference over tuple-independent databases.

2012 ACM Subject Classification: Theory of computation → Incomplete, inconsistent, and uncertain databases; Mathematics of computing → Probabilistic representations; Information systems → Data model extensions
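As a rough illustration of marginal inference in the tuple-independent model, the following minimal sketch enumerates every possible world of a tiny database and sums the probabilities of the worlds in which a Boolean query holds. The relations `R` and `S`, the tuple probabilities, the query, and the function name `marginal_probability` are all hypothetical choices made for this example; they are not taken from the talk. Brute-force enumeration is exponential in the number of tuples and is shown only to make the definition concrete.

```python
from itertools import product


def marginal_probability(tuple_probs, query):
    """Probability that `query` holds in a tuple-independent database.

    tuple_probs maps each fact to its independent probability of being present.
    query is a predicate over a set of facts (one possible world).
    """
    facts = list(tuple_probs)
    total = 0.0
    # Each assignment of present/absent to the facts is one possible world.
    for present in product([False, True], repeat=len(facts)):
        world = {f for f, keep in zip(facts, present) if keep}
        # Independence: the world's probability is a product over all facts.
        weight = 1.0
        for f, keep in zip(facts, present):
            p = tuple_probs[f]
            weight *= p if keep else 1.0 - p
        if query(world):
            total += weight
    return total


# Hypothetical toy database with four uncertain facts.
tuple_probs = {
    ("R", "a"): 0.5,
    ("S", "a"): 0.4,
    ("R", "b"): 0.2,
    ("S", "b"): 0.5,
}


def q(world):
    # Boolean conjunctive query: there exists x with R(x) and S(x).
    return any(("R", x) in world and ("S", x) in world for x in ("a", "b"))


prob = marginal_probability(tuple_probs, q)
print(round(prob, 6))
```

For this particular query the two values of x involve disjoint sets of independent facts, so the result can be checked by hand as 1 - (1 - 0.5 · 0.4)(1 - 0.2 · 0.5) = 0.28.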


[294]  L. Shapley.  A Value for n-person Games , 1988 .

[295]  Thomas Schwentick,et al.  Parallel-Correctness and Parallel-Boundedness for Datalog Programs , 2019, ICDT.

[296]  Jeffrey D. Ullman,et al.  Information integration using logical views , 1997, Theor. Comput. Sci..

[297]  Harald Vranken,et al.  Sustainability of bitcoin and blockchains , 2017 .

[298]  Charu C. Aggarwal,et al.  MayBMS: A System for Managing Large Probabilistic Databases , 2009 .

[299]  David Maier,et al.  The Theory of Relational Databases , 1983 .

[300]  Nieves R. Brisaboa,et al.  Compressed vertical partitioning for efficient RDF management , 2014, Knowledge and Information Systems.

[301]  Christoph Koch,et al.  Incremental query evaluation in a ring of databases , 2010, PODS.

[302]  Jon Louis Bentley,et al.  Quad trees: a data structure for retrieval on composite keys , 1974, Acta Informatica.

[303]  Serge Abiteboul,et al.  Collaborative Access Control in WebdamLog , 2015, SIGMOD Conference.

[304]  S. Ammous,et al.  Blockchain Technology: What is it Good for? , 2016 .

[305]  Sebastian Link,et al.  Independence in Database Relations , 2013, WoLLIC.

[306]  Michael Benedikt,et al.  How Can Reasoners Simplify Database Querying (And Why Haven't They Done It Yet)? , 2018, PODS.

[307]  R. Aumann,et al.  Endogenous Formation of Links Between Players and of Coalitions: An Application of the Shapley Value , 2003 .

[308]  Jeremy Clark,et al.  Bitcoin's academic pedigree , 2017, ACM Queue.

[309]  Thomas Lukasiewicz,et al.  Ontology-Mediated Query Answering over Log-Linear Probabilistic Data (Abstract) , 2019, Description Logics.

[310]  Val Tannen,et al.  Models for Incomplete and Probabilistic Information , 2006, IEEE Data Eng. Bull..

[311]  Thomas Schwentick,et al.  Parallel-Correctness and Containment for Conjunctive Queries with Union and Negation , 2016, ICDT.

[312]  Martín Ugarte,et al.  A Formal Framework for Complex Event Processing , 2019, ICDT.

[313]  Patrick Valduriez,et al.  Principles of Distributed Database Systems , 1990 .

[314]  Martin Shubik,et al.  A Method for Evaluating the Distribution of Power in a Committee System , 1954, American Political Science Review.

[315]  Marko Vukolic,et al.  The Next 700 BFT Protocols , 2015, ACM Trans. Comput. Syst..

[316]  Raymond Reiter.  On Closed World Data Bases , 1977, Logic and Data Bases.

[317]  Carsten Binnig,et al.  BlockchainDB - Towards a Shared Database on Blockchains , 2019, SIGMOD Conference.

[318]  Atri Rudra,et al.  Joins via Geometric Resolutions: Worst-case and Beyond , 2014, PODS.

[319]  Robert B. Ross,et al.  Aggregate operators in probabilistic databases , 2005, JACM.

[320]  Christoph Koch,et al.  PIP: A database system for great and small expectations , 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).

[321]  Alessandro Margara,et al.  Processing flows of information: From data stream to complex event processing , 2012, CSUR.

[322]  Shashi M. Srivastava,et al.  A Course on Borel Sets , 1998, Graduate texts in mathematics.

[323]  Neil Immerman,et al.  Dyn-FO: A Parallel, Dynamic Complexity Class , 1997, J. Comput. Syst. Sci..

[324]  Jeffrey F. Naughton,et al.  Exploiting Data Partitioning To Provide Approximate Results , 2018, BeyondMR@SIGMOD.

[325]  Sherif Sakr,et al.  Stream Processing Languages in the Big Data Era , 2018, SIGMOD Rec..

[326]  Tilmann Rabl,et al.  BlockJoin: Efficient Matrix Partitioning Through Joins , 2017, Proc. VLDB Endow..

[327]  J. Pearl,et al.  Logical and Algorithmic Properties of Conditional Independence and Graphical Models , 1993 .

[328]  Luc De Raedt,et al.  Statistical Relational Artificial Intelligence: Logic, Probability, and Computation , 2016, Statistical Relational Artificial Intelligence.

[329]  Markus L. Schmid.  Characterising REGEX languages by regular languages equipped with factor-referencing , 2016, Inf. Comput..

[330]  Berthold Reinwald,et al.  On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML , 2018, Proc. VLDB Endow..

[331]  Amir Shaikhha,et al.  DBToaster: higher-order delta processing for dynamic, frequently fresh views , 2012, The VLDB Journal.

[332]  Miguel Correia,et al.  Efficient Byzantine Fault-Tolerance , 2013, IEEE Transactions on Computers.

[333]  Thomas Schwentick,et al.  Distribution Constraints: The Chase for Distributed Data , 2020, ICDT.

[334]  J. E. Moyal.  The general theory of stochastic population processes , 1962 .

[335]  Marc Gyssens,et al.  On the completeness of the semigraphoid axioms for deriving arbitrary from saturated conditional independence statements , 2014, Inf. Process. Lett..

[336]  Alfred V. Aho,et al.  The Design and Analysis of Computer Algorithms , 1974 .

[337]  Gonzalo Navarro,et al.  Compact representation of Web graphs with extended functionality , 2014, Inf. Syst..

[338]  Leonid Libkin,et al.  Elements of Finite Model Theory , 2004, Texts in Theoretical Computer Science.

[339]  Frank Neven,et al.  Datalog Queries Distributing over Components , 2017, ACM Trans. Comput. Log..

[340]  Paolo Papotti,et al.  RuleMiner: Data quality rules discovery , 2014, 2014 IEEE 30th International Conference on Data Engineering.

[341]  Phokion G. Kolaitis,et al.  Data exchange with arithmetic operations , 2013, EDBT '13.

[342]  Hung Q. Ngo,et al.  Worst-Case Optimal Join Algorithms: Techniques, Results, and Open Problems , 2018, PODS.

[343]  Mihalis Yannakakis,et al.  Algorithms for Acyclic Database Schemes , 1981, VLDB.

[344]  Evgeny Kharlamov,et al.  Capturing continuous data and answering aggregate queries in probabilistic XML , 2011, TODS.

[345]  Zhen Zhang,et al.  A non-Shannon-type conditional inequality of information quantities , 1997, IEEE Trans. Inf. Theory.

[346]  Tomás Feder,et al.  Homomorphism closed vs. existential positive , 2003, 18th Annual IEEE Symposium of Logic in Computer Science, 2003. Proceedings..

[347]  Zhenliang Liao,et al.  Case study on initial allocation of Shanghai carbon emission trading based on Shapley value , 2015 .

[348]  Y. Shitov.  An improved bound for the lengths of matrix algebras , 2018, Algebra & Number Theory.

[349]  Benny Kimelfeld,et al.  Joining Extractions of Regular Expressions , 2017, PODS.

[350]  Ian Rae,et al.  F1: A Distributed SQL Database That Scales , 2013, Proc. VLDB Endow..

[351]  Philip S. Yu,et al.  A Survey of Uncertain Data Algorithms and Applications , 2009, IEEE Transactions on Knowledge and Data Engineering.

[352]  Wei Hong,et al.  Model-Driven Data Acquisition in Sensor Networks , 2004, VLDB.

[353]  John Grant,et al.  Measuring inconsistency in knowledgebases , 2006, Journal of Intelligent Information Systems.

[354]  Babak Salimi,et al.  From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back , 2014, Theory of Computing Systems.

[355]  Hidehiko Tanaka,et al.  An Overview of The System Software of A Parallel Relational Database Machine GRACE , 1986, VLDB.

[356]  Christoph Koch,et al.  On Query Algebras for Probabilistic Databases , 2009, SIGMOD Rec..