Abstract Despite refinements of the CORDIC algorithm that improve circuit latency and performance through redundant arithmetic and higher-radix techniques, its iterative nature remains the major bottleneck for further optimization. A technique known as flat CORDIC, in which the conventional X and Y recurrences are successively substituted to express the final vectors in terms of the initial vectors, can be used to eliminate the iterative process. This paper presents the techniques devised for the VLSI-efficient implementation of a pipelined 16-bit flat CORDIC based sine–cosine generator. Three possible schemes for pipelining the 16-bit flat CORDIC design are presented to demonstrate the suitability of the proposed method for high-throughput implementations. The 16-bit architecture has been synthesized with a 0.35 μm CMOS process library using Synopsys. Finally, a detailed comparison with other major contributions shows that the flat CORDIC based sine–cosine generators are, on average, 30% faster and occupy about 30% less silicon area.
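The substitution idea behind flat CORDIC can be illustrated with a minimal floating-point sketch. It is not the paper's architecture: the function names, the sign-selection rule (the conventional residual-angle test), and the use of floating point instead of 16-bit fixed point are all assumptions made for illustration. Because every micro-rotation has the form x' = x - s·2^-i·y, y' = y + s·2^-i·x, the whole chain collapses into two coefficients A and B, with A + jB equal to the product of (1 + j·s_i·2^-i), so the final vector is written directly in terms of the initial one.

```python
import math

N = 16  # number of micro-rotations, chosen to match the 16-bit design

# Elementary angles atan(2^-i) and the accumulated CORDIC gain.
ANGLES = [math.atan(2.0 ** -i) for i in range(N)]
GAIN = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(N))


def rotation_signs(theta):
    """Pick each micro-rotation sign with the conventional residual-angle rule.

    Assumption: the abstract does not say how the signs are obtained in the
    flat design; this is the textbook rule, used only for illustration.
    """
    signs, z = [], theta
    for a in ANGLES:
        s = 1 if z >= 0 else -1
        signs.append(s)
        z -= s * a
    return signs


def cordic_iterative(x0, y0, signs):
    """Conventional X and Y recurrences, applied one micro-rotation at a time."""
    x, y = x0, y0
    for i, s in enumerate(signs):
        x, y = x - s * y * 2.0 ** -i, y + s * x * 2.0 ** -i
    return x, y


def cordic_flat(x0, y0, signs):
    """Final vector expressed directly in terms of the initial vector.

    Substituting the recurrences into each other gives
        x_n = A*x0 - B*y0,   y_n = B*x0 + A*y0,
    where A + jB = prod(1 + j*s_i*2^-i). The coefficients depend only on the
    signs, not on intermediate x and y values.
    """
    coeff = complex(1.0, 0.0)
    for i, s in enumerate(signs):
        coeff *= complex(1.0, s * 2.0 ** -i)
    a, b = coeff.real, coeff.imag
    return a * x0 - b * y0, b * x0 + a * y0


def sin_cos(theta):
    """Sine and cosine for |theta| within the CORDIC convergence range (~1.74 rad)."""
    signs = rotation_signs(theta)
    x, y = cordic_flat(1.0 / GAIN, 0.0, signs)  # pre-scale to cancel the gain
    return y, x
```

Calling `sin_cos(0.5)` gives values that agree with `math.sin(0.5)` and `math.cos(0.5)` to roughly four decimal places, the resolution expected from 16 micro-rotations. In software the coefficient product is still a loop; the point of the sketch is only that the result no longer depends on intermediate X and Y values, which is what allows a hardware design to evaluate the expanded expression without iterating.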