An online parallel CRC32 realization for Hybrid Memory Cube protocol

Hybrid Memory Cube (HMC) is a standard in DRAM architecture based on 3D integration that provides high concurrency and reduced latency. HMC uses CRC32 for data integrity, but conventional serial CRC calculation is slow and incurs long latency. Here we propose three methods for implementing a fast parallel CRC. The first method uses the symbolic toolbox in MATLAB to generate the final CRC equations, which are then exported to Verilog so that the CRC can be calculated in a single clock cycle. The second method relies on an existing tool that generates parallel CRC logic. The input data width this tool supports is smaller than the maximum data width allowed in HMC, which is 1152 bits, so we devised a workaround that enables CRC32 calculation for large data widths with this tool. The third method is based on the polynomial mathematics of CRC, in which the CRC is calculated by long division. The latency is one clock cycle for Method 1, two clock cycles for Method 2, and 37 clock cycles for Method 3, compared with 1152 clock cycles for serial CRC.
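
To illustrate why a CRC can be collapsed into single-cycle XOR equations, the following Python sketch compares a bit-serial CRC with a parallel form built from precomputed columns. It is a minimal sketch and not the authors' MATLAB or Verilog flow. The polynomial constant (the IEEE 802.3 one), the zero initial value, the MSB-first bit order, and all function names are assumptions made for illustration; the polynomial and conventions required by the HMC specification should be substituted.

```python
import random

POLY = 0x04C11DB7  # assumption: IEEE 802.3 CRC-32 polynomial, illustrative only
WIDTH = 1152       # maximum HMC data width quoted in the abstract


def crc32_serial(bits):
    """Bit-serial CRC: one shift/XOR step per input bit (MSB first, zero init)."""
    reg = 0
    for b in bits:
        feedback = ((reg >> 31) & 1) ^ b
        reg = (reg << 1) & 0xFFFFFFFF
        if feedback:
            reg ^= POLY
    return reg


def build_columns(width):
    """Return the CRC of every single-bit input of the given width.

    With a zero initial value the CRC is linear over GF(2), so column i is
    exactly the contribution of input bit i to the 32 output bits.
    """
    cols = []
    for i in range(width):
        unit = [0] * width
        unit[i] = 1
        cols.append(crc32_serial(unit))
    return cols


def crc32_parallel(bits, cols):
    """XOR the columns selected by the set input bits (no sequential state)."""
    out = 0
    for b, c in zip(bits, cols):
        if b:
            out ^= c
    return out


def equation_terms(cols, j):
    """Input bit indices whose XOR forms CRC output bit j."""
    return [i for i, c in enumerate(cols) if (c >> j) & 1]


if __name__ == "__main__":
    cols = build_columns(WIDTH)
    data = [random.randint(0, 1) for _ in range(WIDTH)]
    assert crc32_parallel(data, cols) == crc32_serial(data)
    print("inputs feeding output bit 0:", len(equation_terms(cols, 0)))
```

Each list returned by `equation_terms` corresponds to one wide XOR over input bits, which is the kind of equation that could be written out as combinational logic. The serial loop, by contrast, needs one step per input bit, which matches the 1152-cycle serial latency quoted above.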
