A micropower DSP for sensor applications

Ultra-low-power systems, such as wireless microsensor networks and implanted medical devices, are driving the development of processors that can perform increasingly complicated computations using mere microwatts of power. This thesis describes the design of a micropower DSP intended for medium-bandwidth microsensor applications (such as acoustic sensing and tracking). Fabricated in 90 nm CMOS and operating at 450 mV, the DSP achieves 4 MIPS at 40 μW (10 pJ per instruction). Energy efficiency optimizations include a custom CPU instruction set, a miniature instruction cache with a novel replacement strategy, hardware accelerator cores for FIR filter and FFT operations, and extensive power gating of both logic and memory. The tradeoffs among cache size, line length, and replacement policy are explored for very small caches (a few hundred words or fewer), as are the design implications of optimizing the cache for minimum energy without regard to performance (since on-chip memory access is already single-cycle). A replacement policy designed to reduce thrashing in miniature instruction caches is presented. Efficient control of power-gated circuits requires consideration of the minimum off time, or break-even time: the shortest off period for which gating a domain saves energy. An energy model for determining the break-even time is developed, and it correlates with measurements of the power-gated domains on the DSP. The energy savings obtained from hardware accelerators for FIR filtering and FFT operations are measured, and a model is developed to predict the actual net power reduction in a real system, including factors such as sampling rate, leakage power, latency requirements, and power gating overhead. (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
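
The headline figure and the break-even idea mentioned in the abstract can be illustrated with a short Python sketch. The first function only reproduces the arithmetic stated above (40 μW at 4 MIPS). The break-even function is a generic first-order model, assumed here for illustration and not taken from the thesis: it treats the break-even time as the fixed energy cost of one off/on cycle divided by the leakage power saved while the domain is off. All function and parameter names are hypothetical.

```python
def energy_per_instruction(power_w: float, instructions_per_s: float) -> float:
    """Average energy per instruction in joules (power / throughput)."""
    return power_w / instructions_per_s


def break_even_time(overhead_j: float, leak_on_w: float, leak_off_w: float) -> float:
    """First-order break-even time in seconds (illustrative assumption).

    overhead_j : energy spent on one power-down/power-up cycle
    leak_on_w  : leakage power of the domain when left powered
    leak_off_w : residual leakage power when the domain is gated
    """
    saved_w = leak_on_w - leak_off_w
    if saved_w <= 0:
        raise ValueError("gating must reduce leakage for a break-even time to exist")
    return overhead_j / saved_w


def should_gate(idle_s: float, overhead_j: float,
                leak_on_w: float, leak_off_w: float) -> bool:
    """Gate only if the expected idle period exceeds the break-even time."""
    return idle_s > break_even_time(overhead_j, leak_on_w, leak_off_w)


if __name__ == "__main__":
    # Figures from the abstract: 40 uW at 4 million instructions per second.
    e = energy_per_instruction(40e-6, 4e6)
    print(f"{e * 1e12:.1f} pJ per instruction")
```

Under this simplified model, gating a domain for less than the break-even time costs more energy than leaving it on, which is why a controller would compare the expected idle period against that threshold before switching a domain off.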
