Synthesis from multi-cycle atomic actions as a solution to the timing closure problem

One solution to the timing closure problem is to perform infrequent operations over more than one clock cycle. Despite the simplicity of this idea, it is not readily adopted in practice because it requires changes to the RTL, which in turn exacerbate the verification problem. We offer a timing closure solution that is guaranteed to preserve the functional correctness of designs expressed using atomic actions, or rules. We exploit the fact that the semantics of atomic actions is untimed; that is, the time taken to execute an action is not specified. The current hardware synthesis technique from atomic actions assumes that each rule completes its computation in one clock cycle. Consequently, the rule with the longest combinational path determines the clock period of the entire design, often leading to needlessly slow circuits. We present a synthesis procedure in which the combinational circuits embodied in a rule can take multiple cycles without changing the semantics of the original design. We also present preliminary results from an experimental compiler that uses the Bluespec SystemVerilog (BSV) compiler front end and generates Verilog. The results show that the clock speed and performance of circuits can be improved substantially by allowing slow paths to complete over multiple cycles. Our technique is orthogonal to solutions based on multiple clock domains.
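The timing argument can be illustrated with a small back-of-the-envelope model. The sketch below is not the synthesis procedure itself; it assumes a toy design in which exactly one rule fires at a time, and the rule names, delays, firing fractions, and the 3 ns target period are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical rules: (name, combinational delay in ns, fraction of all firings).
# These numbers are illustrative assumptions, not measurements from the paper.
RULES = [
    ("fast_a", 2.0, 0.60),
    ("fast_b", 3.0, 0.35),
    ("slow_c", 9.0, 0.05),  # infrequent rule with the longest combinational path
]


def cycles_needed(delay_ns, period_ns):
    """Number of clock cycles a rule needs when the clock period is period_ns."""
    return max(1, math.ceil(delay_ns / period_ns))


def avg_time_per_firing(rules, period_ns):
    """Average wall-clock time per rule firing, assuming one rule fires at a time."""
    avg_cycles = sum(
        freq * cycles_needed(delay, period_ns) for _, delay, freq in rules
    )
    return period_ns * avg_cycles


# Single-cycle rules: the slowest rule dictates the clock period.
single_period = max(delay for _, delay, _ in RULES)
single_time = avg_time_per_firing(RULES, single_period)

# Multi-cycle rules: pick a shorter period and let slow_c span several cycles.
multi_period = 3.0
multi_time = avg_time_per_firing(RULES, multi_period)

print(f"single-cycle: period {single_period} ns, avg {single_time:.2f} ns per firing")
print(f"multi-cycle:  period {multi_period} ns, avg {multi_time:.2f} ns per firing")
```

In this toy model the slow rule takes three cycles at the shorter period, yet the average time per firing falls from about 9 ns to about 3.3 ns because that rule fires rarely. Real designs fire several rules concurrently and pay additional control overhead, so actual gains depend on the design.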
