StreamIt: A Language for Streaming Applications

We characterize high-performance streaming applications as a new and distinct domain of programs that is becoming increasingly important. The StreamIt language provides novel high-level representations to improve programmer productivity and program robustness within the streaming domain. At the same time, the StreamIt compiler aims to improve the performance of streaming applications via stream-specific analyses and optimizations. In this paper, we motivate, describe and justify the language features of StreamIt, which include: a structured model of streams, a messaging system for control, a re-initialization mechanism, and a natural textual syntax.
