Decentralized dynamic resource management support for massively parallel processor arrays

This paper presents a hardware-supported resource management methodology for massively parallel processor arrays. It enables processing elements to autonomously explore resource availability in their neighborhood. To support resource exploration, we introduce specialized controllers that can be attached to each processing element. We propose two types of architecture for the exploration controller: fast FSM-based designs and flexible programmable controllers. These controllers make it possible to implement different distributed resource exploration strategies, so that parallel programs can explore and reserve available resources according to their application requirements. Hardware cost evaluations show that the simplest implementation of our programmable controller costs about as much as our FSM-based implementations, while offering the flexibility to implement different exploration strategies. We show that the proposed distributed approach can achieve a significant speedup compared with centralized resource exploration methods.
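
To make the idea of neighborhood exploration concrete, the following is a minimal software sketch, not the controller design evaluated in the paper. It assumes a 2D array in which each processing element (PE) knows only whether its four direct neighbors are free, and it models one possible strategy: a PE claims itself and hands the remaining request to the first free neighbor it finds. The function name `explore_linear`, the boolean `free` grid, and the east/south/west/north probing order are all illustrative assumptions.

```python
from typing import List, Optional, Tuple

Coord = Tuple[int, int]

# Hypothetical probing order: east, south, west, north.
NEIGHBOR_OFFSETS = ((0, 1), (1, 0), (0, -1), (-1, 0))


def explore_linear(free: List[List[bool]], start: Coord, needed: int) -> List[Coord]:
    """Claim up to `needed` PEs along a path that begins at `start`.

    Each claimed PE inspects only its own four neighbors and forwards the
    remaining request to the first free one. No global view of the array
    is consulted, which is what makes the exploration decentralized.
    The `free` grid is updated in place to mark claimed PEs as occupied.
    """
    rows, cols = len(free), len(free[0])
    claimed: List[Coord] = []
    current: Optional[Coord] = start
    while current is not None and len(claimed) < needed:
        r, c = current
        if not free[r][c]:
            break  # the starting PE itself is already occupied
        free[r][c] = False
        claimed.append(current)
        # Local decision: pick the first free neighbor, if any.
        current = None
        for dr, dc in NEIGHBOR_OFFSETS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and free[nr][nc]:
                current = (nr, nc)
                break
    return claimed


if __name__ == "__main__":
    grid = [[True, False, True],
            [True, True, True],
            [True, True, True]]
    print(explore_linear(grid, (0, 0), 4))
```

The path stops early when a PE has no free neighbor, so the caller may receive fewer PEs than requested. Swapping the probing order or the forwarding rule would model a different exploration strategy, which is the kind of variation a programmable controller is meant to accommodate.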
