Low cost and latency embedded 3D graphics reciprocation

This paper presents a low-cost, low-latency reciprocation method for the fixed-point datapath of embedded 3D graphics accelerators. The algorithm exploits the limitations of the human visual system, which allow a reasonable amount of error to be introduced into the computation without inducing noticeable image artifacts. In the example given in the paper, taken from the antialiasing datapath of an embedded QVGA graphics hardware accelerator, the reciprocal of a 14-bit operand is computed with an inexpensive operand prescaler, one 1k lookup table with 10-bit entries, and a 5-bit adder, for a maximum relative error of only 1.5% over the entire range of the operand. Hardware synthesis in a typical 0.18 μm process technology indicates that the implementation requires only 1600 standard cells to achieve a latency of 2.5 ns.
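To make the general idea of a prescaled table lookup concrete, the following is a minimal sketch and not the paper's design. The abstract does not describe the internals of the prescaler or the role of the 5-bit adder, so neither is modeled here. The sketch assumes that "1k" means 1024 entries, that prescaling is a plain normalizing left shift, and that each 10-bit entry stores the fraction bits of the mantissa reciprocal with an implicit leading one. All names (`build_table`, `reciprocal`, `max_relative_error`) are hypothetical, and the error this sketch achieves should not be read as the paper's 1.5% figure.

```python
# Hypothetical sketch of a normalize-then-lookup reciprocal for a 14-bit operand.
# Assumptions (not from the paper): the prescaler is a normalizing left shift,
# the table has 1024 entries, and entries hold fraction bits with an implicit
# leading one. The paper's 5-bit adder is not modeled.

OPERAND_BITS = 14   # operand width stated in the abstract
INDEX_BITS = 10     # assumed: 1k = 1024 table entries
ENTRY_BITS = 10     # entry width stated in the abstract


def build_table():
    """Precompute 10-bit entries approximating 1/m for m in [1, 2)."""
    table = []
    for idx in range(1 << INDEX_BITS):
        # Midpoint of the mantissa interval covered by this index.
        m = 1.0 + (idx + 0.5) / (1 << INDEX_BITS)
        # 1/m lies in (0.5, 1], so 2**11 / m rounds into [1024, 2047].
        full = round((1 << (ENTRY_BITS + 1)) / m)
        # Drop the implicit leading one; the remainder fits in 10 bits.
        table.append(full - (1 << ENTRY_BITS))
    return table


TABLE = build_table()


def reciprocal(x):
    """Approximate 1/x for an integer x with 0 < x < 2**14."""
    if not 0 < x < (1 << OPERAND_BITS):
        raise ValueError("operand must be a nonzero 14-bit value")
    # Prescale: shift left until the leading one sits in bit 13.
    shift = OPERAND_BITS - x.bit_length()
    xn = x << shift
    # The 10 bits below the leading one select the table entry.
    idx = (xn >> (OPERAND_BITS - 1 - INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    # Restore the implicit leading one: mant / 2**11 approximates 1/m.
    mant = (1 << ENTRY_BITS) + TABLE[idx]
    # Undo the prescaling: x = m * 2**(13 - shift).
    return mant / (1 << (ENTRY_BITS + 1)) * 2.0 ** (shift - (OPERAND_BITS - 1))


def max_relative_error():
    """Exhaustively measure the worst relative error over all operands."""
    return max(abs(reciprocal(x) * x - 1.0) for x in range(1, 1 << OPERAND_BITS))
```

Calling `max_relative_error()` sweeps every nonzero 14-bit operand, which is cheap at this width and mirrors the abstract's claim of an error bound "over the entire range of the operand". A hardware design would emit the mantissa and shift amount as fixed-point signals; the floating-point return value here is only for checking the error.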