Prevent Deadlock and Remove Blocking for Self-Timed Systems

Designers of distributed embedded systems face two problems: how to prevent deadlock and how to improve performance. Solving them requires an accurate model that abstracts both functionality and performance. A self-timed system model, in which communication is carried out through handshaking protocols, is well suited to modeling these distributed embedded systems. This paper studies the fundamental properties of self-timed systems and proposes solutions to both problems. First, we present necessary and sufficient conditions for a self-timed system constructed from an application to incur deadlock, and we then propose approaches that prevent deadlock when constructing self-timed systems. Second, we observe that when data progresses at different paces along two paths that share common source and destination nodes, blocking events (distinct from deadlock) may occur and dramatically degrade system performance. We establish theorems to detect blocking events and design Mixed-Integer Linear Programming (MILP) formulations to eliminate them. Experimental results show that most self-timed systems constructed by a straightforward approach can incur deadlock, while our proposed methods guarantee that no deadlock occurs. Furthermore, our techniques for eliminating blocking events achieve a 48.23% performance improvement on average compared with the straightforward approach.
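To make the blocking observation concrete, the following is a minimal sketch of how one might flag pairs of nodes joined by parallel paths of unequal accumulated latency. It is an illustration only: it does not implement the paper's detection theorems or its MILP formulation, and the names `all_simple_paths`, `path_imbalances`, the adjacency-dict graph, and the per-node `latency` table are all assumptions introduced here.

```python
def all_simple_paths(graph, src, dst, path=None):
    """Yield every simple path from src to dst in a directed graph.

    graph is an adjacency dict {node: [successor, ...]} (hypothetical format).
    """
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph.get(src, []):
        if nxt not in path:  # skip nodes already on the path (no revisits)
            yield from all_simple_paths(graph, nxt, dst, path)


def path_imbalances(graph, latency):
    """Return {(src, dst): (min_total, max_total)} for node pairs that are
    connected by two or more paths whose summed node latencies differ.

    Such pairs are only candidates for the blocking described above;
    whether blocking really occurs depends on details not modeled here.
    """
    nodes = set(graph) | {v for succs in graph.values() for v in succs}
    result = {}
    for src in nodes:
        for dst in nodes:
            if src == dst:
                continue
            totals = [
                sum(latency[n] for n in p)
                for p in all_simple_paths(graph, src, dst)
            ]
            if len(totals) >= 2 and min(totals) != max(totals):
                result[(src, dst)] = (min(totals), max(totals))
    return result


# Toy application graph: A reaches D via B (short) and via C, E (long).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "E": ["D"]}
latency = {n: 1 for n in "ABCDE"}  # assumed unit latency per node
imbalanced = path_imbalances(graph, latency)
```

In this toy graph only the pair (A, D) is reported, because the path through B accumulates less latency than the path through C and E. Enumerating all simple paths is exponential in the worst case, so this form is suitable only for small examples.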
