A 210MHz 15mW Unified Vector and Transcendental Function Unit for Handheld 3-D
A low-power, area-efficient 4-way 32-bit unified vector and transcendental function unit has been developed for programmable shaders for handheld 3-D graphics systems. It adopts the logarithmic number system (LNS) at the arithmetic core for small-size, low-power unification and single-cycle throughput with a maximum 4-cycle latency for various vector and transcendental functions. A novel logarithmic conversion scheme is proposed with a maximum conversion error of 0.41%. A test chip is implemented in 0.18-µm CMOS technology with 91K gates. It operates at 210MHz and consumes 15mW at 1.8V.

I. INTRODUCTION

Modern graphics processing units (GPUs) for 3-D graphics systems adopt programmable processors, known as shaders, in the 3-D graphics pipeline stages to provide various graphics effects (1)(2). Since the shaders require complex vector and transcendental functions, it is challenging to realize them on a handheld platform that has a small footprint and limited battery life. There has been a study on area-efficient multi-function units dealing with various transcendental functions for high-end graphics systems (3). However, it did not take into account power consumption or the unification of transcendental functions with vector arithmetic operations.

In this paper, we present a unified arithmetic unit for the mathematical operations used in 3-D graphics programmable shaders. This architecture unifies transcendental functions, including trigonometric functions, power with an arbitrary exponent, and logarithm in any base, with vector multiplication, division, square root, and inner product in a single arithmetic platform. It adopts the logarithmic number system (LNS) at the arithmetic core for small-size and low-power unification, and achieves single-cycle throughput for all operations.
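To illustrate why an LNS core lends itself to this kind of unification, the following is a minimal Python sketch of the general principle only. It is not the chip's datapath: it uses floating-point `math.log2` in place of the paper's fixed-point logarithmic conversion scheme, handles positive operands only, and omits the trigonometric functions, whose treatment this section does not describe. All function names (`to_lns`, `from_lns`, `lns_mul`, and so on) are hypothetical and chosen for illustration.

```python
import math


def to_lns(x):
    """Convert a positive real number to its base-2 logarithm.

    Assumption: positive inputs only. Real LNS hardware keeps a separate
    sign bit and a zero flag, which this sketch omits.
    """
    if x <= 0:
        raise ValueError("this sketch handles positive values only")
    return math.log2(x)


def from_lns(lx):
    """Convert a base-2 logarithm back to the linear domain."""
    return 2.0 ** lx


def lns_mul(a, b):
    # Multiplication becomes an addition of logarithms.
    return from_lns(to_lns(a) + to_lns(b))


def lns_div(a, b):
    # Division becomes a subtraction of logarithms.
    return from_lns(to_lns(a) - to_lns(b))


def lns_sqrt(a):
    # Square root halves the logarithm (a one-bit shift in fixed point).
    return from_lns(to_lns(a) / 2)


def lns_pow(a, p):
    # Power with an arbitrary exponent scales the logarithm by p.
    return from_lns(p * to_lns(a))


def lns_log(a, base):
    # Logarithm in any base is a ratio of two base-2 logarithms.
    return to_lns(a) / to_lns(base)


def lns_dot(u, v):
    # Inner product: each product is an addition in the log domain;
    # the products are summed after conversion back to the linear domain.
    return sum(from_lns(to_lns(x) + to_lns(y)) for x, y in zip(u, v))
```

In this sketch every operation reduces to one conversion into the log domain, a simple add, subtract, scale, or shift, and one conversion back, which is the property that lets a single arithmetic core serve both the vector and the transcendental functions. For example, `lns_pow(x, 0.5)` and `lns_sqrt(x)` follow the same path and differ only in how the logarithm is scaled.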
[1] Stuart F. Oberman, et al. A high-performance area-efficient multifunction interpolator, 2005, 17th IEEE Symposium on Computer Arithmetic (ARITH'05).