Systolic array based VLSI architecture for high throughput 2-D discrete wavelet transform

A new data scan method is proposed for the 2-D discrete wavelet transform (DWT) that accesses more pixels per clock cycle. Unlike existing stripe-based methods, our design reads and processes adjacent even and odd rows at the same time. The concurrent outputs of the even-row and odd-row transform units inherently eliminate the data sequencing step between the row transform and the column transform, so the transposition memory is no longer needed. For the row transform unit, a novel systolic array structure is constructed, with pipelining employed to reduce the critical path delay. Without requiring many additional registers, the resulting critical path delay of Tm is shorter than that of most stripe-based designs. For the column transform unit, a conventional lifting-based two-input/two-output structure is adopted. Theoretical analysis shows that the design is suited to applications that demand a high throughput rate and a high operating frequency. Synthesis results in a UMC 130 nm process show that the area-delay product is 23%, 27.3% and 29.6% better than that of the best existing stripe-based structure for S = 2, 4 and 8, respectively.
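
The dataflow idea can be illustrated with a small software sketch. It is not the proposed hardware: it models neither the systolic array, the pipelining, nor the parallelism factor S. The abstract does not name a wavelet filter, so the sketch assumes the 5/3 lifting filter in floating point with symmetric edge extension, and all function names are hypothetical. It compares a conventional row pass, transpose, column pass against a version in which each even/odd row pair is row-transformed together and fed straight into a two-input/two-output column lifting step.

```python
def lift53(x):
    """One level of 5/3 lifting (assumed filter) on an even-length sequence."""
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    # Predict: odd sample minus the mean of its two even neighbours
    # (the right edge mirrors the last even sample).
    high = [odd[i] - 0.5 * (even[i] + even[min(i + 1, n - 1)]) for i in range(n)]
    # Update: even sample plus a quarter of its two high-pass neighbours
    # (the left edge mirrors the first high-pass sample).
    low = [even[i] + 0.25 * (high[max(i - 1, 0)] + high[i]) for i in range(n)]
    return low, high


def row_unit(row):
    """Row transform of one image row, returned as low half then high half."""
    low, high = lift53(row)
    return low + high


def dwt2_reference(img):
    """Conventional order: all rows first, transpose, then all columns."""
    rows = [row_unit(r) for r in img]
    cols = [lift53(list(c)) for c in zip(*rows)]  # zip(*rows) is the transposition
    low = [list(r) for r in zip(*(c[0] for c in cols))]
    high = [list(r) for r in zip(*(c[1] for c in cols))]
    return low, high


def column_step(even_row, odd_row, next_even, prev_high):
    """Two-input/two-output column lifting applied element-wise to a row pair."""
    high = [o - 0.5 * (e + ne) for e, o, ne in zip(even_row, odd_row, next_even)]
    before = high if prev_high is None else prev_high  # top-edge mirroring
    low = [e + 0.25 * (b + h) for e, b, h in zip(even_row, before, high)]
    return low, high


def dwt2_row_pairs(img):
    """Row-pair order: even and odd rows are transformed together, no transpose."""
    low_rows, high_rows = [], []
    pending = None    # row pair waiting for the even row below it
    prev_high = None  # high-pass output row of the previous pair
    for r in range(0, len(img), 2):
        even_row, odd_row = row_unit(img[r]), row_unit(img[r + 1])
        if pending is not None:
            low, high = column_step(pending[0], pending[1], even_row, prev_high)
            low_rows.append(low)
            high_rows.append(high)
            prev_high = high
        pending = (even_row, odd_row)
    # Bottom edge: mirror the last even row in place of the missing one.
    low, high = column_step(pending[0], pending[1], pending[0], prev_high)
    low_rows.append(low)
    high_rows.append(high)
    return low_rows, high_rows
```

Calling `dwt2_reference(img)` and `dwt2_row_pairs(img)` on the same image with even height and width returns the same low-pass and high-pass row sets. In this sketch the row-pair version keeps only one pending row pair and one high-pass row as state, rather than a buffer for reordering the whole row-transform output.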