SFL: A Structured Dataflow Language Based on SQL and FP

SFL (pronounced "Sea-Flow") is an analytics system built around a declarative language that extends SQL for specifying the dataflow of data-intensive analytics. The extension is motivated by the need for a top-level representation of a converged platform for analytics and data management. Because it offers fast data access and reduces data transfer, such convergence has become key to speeding up and scaling up data-intensive BI applications. A SFL query is constructed from conventional SQL queries by means of Function Forms (FFs). While a conventional SQL query represents a dataflow tree, a SFL query represents a more general dataflow graph. We support SFL query execution by tightly integrating it with the evaluation of its component queries, which minimizes the overhead of data retrieval, copying, moving, and buffering and effectively turns a query engine into a generalized dataflow engine. We discuss experimental results from a prototype built by extending the PostgreSQL engine.