Stream Types

We propose a rich foundational theory of typed data streams and stream transformers, motivated by two high-level goals: (1) the type of a stream should be able to express complex sequential patterns of events over time, and (2) it should describe the parallel structure of the stream to enable deterministic stream processing on parallel and distributed systems. To this end, we introduce stream types, with operators capturing sequential composition, parallel composition, and iteration, together with a core calculus of transformers over typed streams. The calculus naturally supports a number of common streaming idioms, including punctuation, windowing, and parallel partitioning, as first-class constructions. It exploits a Curry-Howard-like correspondence with an ordered variant of the logic of Bunched Implications to program with streams compositionally, and it uses Brzozowski-style derivatives to enable an incremental, event-based operational semantics. To validate our design, we provide a reference interpreter and machine-checked proofs of the main results.
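To make the idea of derivative-based, event-at-a-time processing concrete, the following is a minimal sketch of classic Brzozowski derivatives over a small regular-expression-like type language with sequential composition and iteration. It is not the calculus from the paper: parallel composition, transformers, and the typing rules are not modeled, and every name here (`Ev`, `Seq`, `Alt`, `Star`, `deriv`, `accepts`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


class Ty:
    """Base class for the toy stream types (hypothetical, for illustration)."""


@dataclass(frozen=True)
class Void(Ty):
    """Describes no stream at all."""


@dataclass(frozen=True)
class Eps(Ty):
    """Describes only the empty stream."""


@dataclass(frozen=True)
class Ev(Ty):
    """Describes a stream with exactly one event carrying this tag."""
    tag: str


@dataclass(frozen=True)
class Seq(Ty):
    """Sequential composition: a `first` stream followed by a `second` stream."""
    first: Ty
    second: Ty


@dataclass(frozen=True)
class Alt(Ty):
    """Choice between two types (needed to express derivatives of Seq)."""
    left: Ty
    right: Ty


@dataclass(frozen=True)
class Star(Ty):
    """Iteration: zero or more repetitions of `body`."""
    body: Ty


def nullable(t: Ty) -> bool:
    """True if the empty stream inhabits t, i.e. the stream may end here."""
    if isinstance(t, (Eps, Star)):
        return True
    if isinstance(t, Seq):
        return nullable(t.first) and nullable(t.second)
    if isinstance(t, Alt):
        return nullable(t.left) or nullable(t.right)
    return False  # Void and Ev


def deriv(t: Ty, tag: str) -> Ty:
    """Type of the remainder of the stream after one event `tag` has arrived."""
    if isinstance(t, Ev):
        return Eps() if t.tag == tag else Void()
    if isinstance(t, Seq):
        d = Seq(deriv(t.first, tag), t.second)
        # If `first` may already be finished, the event may belong to `second`.
        return Alt(d, deriv(t.second, tag)) if nullable(t.first) else d
    if isinstance(t, Alt):
        return Alt(deriv(t.left, tag), deriv(t.right, tag))
    if isinstance(t, Star):
        return Seq(deriv(t.body, tag), t)
    return Void()  # Eps and Void admit no further events


def accepts(t: Ty, events) -> bool:
    """Consume events one at a time, updating the type incrementally."""
    for tag in events:
        t = deriv(t, tag)
    return nullable(t)


# Repeated sessions of the shape: open, any number of data events, close.
session = Star(Seq(Ev("open"), Seq(Star(Ev("data")), Ev("close"))))
```

Under these assumptions, `accepts(session, ["open", "data", "data", "close"])` holds, while a stream that stops after `["open", "data"]` is rejected because the remaining type still expects a `close` event. The point of the sketch is only that each incoming event rewrites the type of what remains, which is the general shape of an incremental, event-based semantics.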
