Micropreemption synthesis: an enabling mechanism for multitask VLSI systems

Task preemption is a critical enabling mechanism in multitask very large scale integration (VLSI) systems. On preemption, the data in the register files must be preserved so that the task can later be resumed. This requires extra memory to hold the context and additional clock cycles to save and restore it. This paper presents techniques and algorithms that incorporate micropreemption constraints into multitask VLSI system synthesis. Three contributions have been developed. First, algorithms insert and refine preemption points in scheduled task graphs subject to preemption latency constraints. Second, techniques minimize the context switch overhead by considering both the dedicated registers required to save the state of a task on preemption and the shared registers required to save the remaining values in the tasks. Third, a controller-based scheme precludes preemption-related performance degradation by: 1) partitioning the states of a task into critical sections; 2) executing the critical sections atomically; and 3) preserving atomicity by rolling forward to the end of the critical section on preemption. The effectiveness of all approaches, algorithms, and software implementations is demonstrated on real examples. Validation is complete in the sense that functional simulation is carried through to the complete layout implementation.
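
To make the interplay between preemption points and roll-forward concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a linear schedule of control steps, a hypothetical per-boundary count of live values (`live_counts`) standing in for the context size, and a latency bound expressed in control steps. A simple greedy heuristic places each preemption point at the cheapest boundary within the allowed window, and a second helper computes how many steps a task rolls forward before it can be suspended. The function names and the cost model are illustrative assumptions.

```python
def insert_preemption_points(live_counts, max_latency):
    """Greedy heuristic (illustrative only).

    live_counts[i] is the assumed number of live values at the boundary
    after control step i. Returns the boundaries chosen as preemption
    points so that at most max_latency steps separate consecutive points.
    The task start and the task end count as implicit points.
    """
    if max_latency < 1:
        raise ValueError("max_latency must be at least 1")
    n = len(live_counts)
    points = []
    last = -1  # implicit point before step 0
    while (n - 1) - last > max_latency:
        window = range(last + 1, last + 1 + max_latency)
        # Fewest live values first; ties go to the latest boundary.
        best = min(window, key=lambda i: (live_counts[i], -i))
        points.append(best)
        last = best
    return points


def rollforward_latency(step, points, n_steps):
    """Steps executed, including the current one, when a preemption
    request arrives during `step` and the task rolls forward to the end
    of its critical section."""
    ends = list(points) + [n_steps - 1]  # the task end closes the last section
    end = next(p for p in ends if p >= step)
    return end - step + 1


if __name__ == "__main__":
    counts = [3, 1, 4, 1, 5, 9, 2, 6]  # made-up live-value counts
    pts = insert_preemption_points(counts, max_latency=3)
    worst = max(rollforward_latency(s, pts, len(counts))
                for s in range(len(counts)))
    print(pts, worst)
```

Under these assumptions the worst-case roll-forward never exceeds `max_latency`, because each critical section spans at most that many control steps. A real synthesis flow would operate on a scheduled task graph rather than a linear list and would weigh dedicated against shared registers.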
