The declarative imperative: experiences and conjectures in distributed logic

The rise of multicore processors and cloud computing is putting enormous pressure on the software community to find solutions to the difficulty of parallel and distributed programming. At the same time, there is more (and more varied) interest in data-centric programming languages than at any time in computing history, in part because these languages parallelize naturally. This juxtaposition raises the possibility that the theory of declarative database query languages can provide a foundation for the next generation of parallel and distributed programming languages. In this paper I reflect on my group's experience over seven years using Datalog extensions to build networking protocols and distributed systems. Based on that experience, I present a number of theoretical conjectures that may both interest the database community and clarify important practical issues in distributed computing. Most importantly, I make a case for database researchers to take a leadership role in addressing the impending programming crisis. This is an extended version of an invited lecture at the ACM PODS 2010 conference [32].

[1]  Daisy Zhe Wang,et al.  Querying probabilistic information extraction , 2010, Proc. VLDB Endow.

[2]  Joseph M. Hellerstein,et al.  I do declare: consensus in a logic language , 2010, OPSR.

[3]  David Maier,et al.  Dedalus: Datalog in Time and Space , 2010, Datalog.

[4]  Robbert van Renesse,et al.  Toward a cloud computing research agenda , 2009, SIGACT News.

[5]  Butler W. Lampson.  Getting computers to understand , 2003, JACM.

[6]  Gurmeet Singh Manku,et al.  Symphony: Distributed Hashing in a Small World , 2003, USENIX Symposium on Internet Technologies and Systems.

[7]  Daisy Zhe Wang,et al.  Probabilistic declarative information extraction , 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).

[8]  Monica S. Lam,et al.  Cloning-based context-sensitive pointer alias analysis using binary decision diagrams , 2004, PLDI '04.

[9]  Noah A. Smith,et al.  Dyna: a declarative language for implementing dynamic programs , 2004, ACL 2004.

[10]  Ion Stoica,et al.  Declarative networking: language, execution and optimization , 2006, SIGMOD Conference.

[11]  David Chu,et al.  Evita raced: metacompilation for declarative networks , 2008, Proc. VLDB Endow.

[12]  David Chu,et al.  Automating rendezvous and proxy selection in sensornets , 2009, 2009 International Conference on Information Processing in Sensor Networks.

[13]  Christos H. Papadimitriou.  Database metatheory: asking the big queries , 1995, PODS '95.

[14]  Seth Copen Goldstein,et al.  Meld: A declarative approach to programming ensembles , 2007, 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[15]  Ion Stoica,et al.  Declarative networking , 2009, Commun. ACM.

[16]  Trevor Jim,et al.  SD3: a trust management system with certified evaluation , 2001, Proceedings 2001 IEEE Symposium on Security and Privacy. S&P 2001.

[17]  Srinivasan Seshan,et al.  Synopsis diffusion for robust aggregation in sensor networks , 2004, SenSys '04.

[18]  Jim Gray,et al.  What next?: A dozen information-technology research goals , 1999, JACM.

[19]  Joseph M. Hellerstein,et al.  The design and implementation of declarative networks , 2006 .

[20]  Serge Abiteboul,et al.  Diagnosis of asynchronous discrete event systems: datalog to the rescue! , 2005, PODS.

[21]  Robbert van Renesse,et al.  Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining , 2003, TOCS.

[22]  Bertram Ludäscher,et al.  On Active Deductive Databases: The Statelog Approach , 1996, Transactions and Change in Logic Databases.

[23]  David A. McAllester,et al.  The Generalized A* Architecture , 2007, J. Artif. Intell. Res.

[24]  Kenneth A. Ross.  A Syntactic Stratification Condition Using Constraints , 1994, ILPS.

[25]  Helen J. Wang,et al.  Online aggregation , 1997, SIGMOD '97.

[26]  Joseph M. Hellerstein.  Datalog redux: experience and conjecture , 2010, PODS '10.

[27]  John C. Mitchell,et al.  A Compositional Logic for Proving Security Properties of Protocols , 2003, J. Comput. Secur.

[28]  Ashima Atul,et al.  Compact Implementation of Distributed Inference Algorithms for Network , 2009 .

[29]  Katherine A. Morris,et al.  An algorithm for ordering subgoals in NAIL! , 1988, PODS.

[30]  Wei Hong,et al.  TAG: a Tiny AGgregation Service for Ad-Hoc Sensor Networks , 2002, OSDI.

[31]  Joseph M. Hellerstein,et al.  Using state modules for adaptive query processing , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[32]  Ion Stoica,et al.  Implementing declarative overlays , 2005, SOSP '05.

[33]  Sergei Vassilvitskii,et al.  A model of computation for MapReduce , 2010, SODA '10.

[34]  David E. Culler,et al.  SEDA: an architecture for well-conditioned, scalable internet services , 2001, SOSP.

[35]  Ondrej Lhoták,et al.  Jedd: a BDD-based relational extension of Java , 2004, PLDI '04.

[36]  Johannes Gehrke,et al.  Declarative processing for computer games , 2008, Sandbox '08.

[37]  John G. Cleary,et al.  An Operational Semantics of Starlog , 1999, PPDP.

[38]  Andrey Rybalchenko,et al.  Operational Semantics for Declarative Networking , 2009, PADL.

[39]  Rajeev Motwani,et al.  Coloring Away Communication in Parallel Query Optimization , 1995, VLDB.

[40]  Teodor C. Przymusinski.  On the Declarative Semantics of Deductive Databases and Logic Programs , 1988, Foundations of Deductive Databases and Logic Programming.

[41]  Elnar Hajiyev,et al.  CodeQuest: querying source code with datalog , 2005, OOPSLA '05.

[42]  Jim Waldo,et al.  A Note on Distributed Computing , 1996, Mobile Object Systems.

[43]  Pat Helland,et al.  Building on Quicksand , 2009, CIDR.

[44]  Moshe Y. Vardi.  The complexity of relational query languages (Extended Abstract) , 1982, STOC '82.

[45]  Yun Mao.  On the declarativity of declarative networking , 2010, OPSR.

[46]  A. N. Wilschut,et al.  Dataflow query execution in a parallel main-memory environment , 1991, [1991] Proceedings of the First International Conference on Parallel and Distributed Information Systems.

[47]  Joseph M. Hellerstein,et al.  BOOM: Data-Centric Programming in the Datacenter , 2009 .

[48]  Roger M. Needham,et al.  On the duality of operating system structures , 1979, OPSR.

[49]  Kyuseok Shim,et al.  Query Optimization in the Presence of Foreign Functions , 1993, VLDB.

[50]  George C. Necula,et al.  Capriccio: scalable threads for internet services , 2003, SOSP '03.

[51]  Dirk Van Gucht,et al.  Computationally Complete Relational Query Languages , 2009, Encyclopedia of Database Systems.

[52]  Joseph M. Hellerstein,et al.  Toward network data independence , 2003, SIGMOD Rec.

[53]  Werner Vogels,et al.  Dynamo: amazon's highly available key-value store , 2007, SOSP.

[54]  Jeffrey D. Ullman,et al.  Optimizing joins in a map-reduce environment , 2010, EDBT '10.

[55]  Carlo Zaniolo,et al.  Logic-Based User-Defined Aggregates for the Next Generation of Database Systems , 1999, The Logic Programming Paradigm.

[56]  Scott Shenker,et al.  Enhancing P2P File-Sharing with an Internet-Scale Query Processor , 2004, VLDB.

[57]  David Chu,et al.  Building and optimizing declarative networked systems , 2009 .

[58]  Erik Meijer,et al.  Confessions of a used programming language salesman , 2007, OOPSLA.

[59]  Sergio Greco,et al.  Greedy Algorithms in Datalog with Choice and Negation , 1998, IJCSLP.

[60]  Guy M. Lohman.  Grammar-like Functional Rules for Representing Query Optimization Alternatives , 1998 .

[61]  Roy Goldman,et al.  WSQ/DSQ: a practical approach for combined querying of databases and the Web , 2000, SIGMOD '00.

[62]  Martín Abadi,et al.  Unified Declarative Platform for Secure Networked Information Systems , 2009, 2009 IEEE 25th International Conference on Data Engineering.

[63]  Philip Levis,et al.  The design and implementation of a declarative sensor network system , 2007, SenSys '07.

[64]  John Field,et al.  Reactors: A data-oriented synchronous/asynchronous programming model for distributed applications , 2007, Theor. Comput. Sci.

[65]  David Harel,et al.  Structure and complexity of relational queries , 1980, 21st Annual Symposium on Foundations of Computer Science (sfcs 1980).

[66]  Readings in Database Systems, Third Edition , 1998 .

[67]  Ion Stoica,et al.  Declarative routing: extensible routing with declarative queries , 2005, SIGCOMM '05.

[68]  Charlene O'Hanlon.  A Conversation with Jordan Cohen , 2006, ACM Queue.

[69]  Yin Zhang,et al.  STAR: Self-Tuning Aggregation for Scalable Monitoring , 2007, VLDB.

[70]  Badrish Chandramouli,et al.  On-the-fly Progress Detection in Iterative Stream Queries , 2009, Proc. VLDB Endow.

[71]  Leslie Lamport,et al.  Time, clocks, and the ordering of events in a distributed system , 1978, CACM.