Low power engineering
[1] Rajesh K. Gupta,et al. Power savings in embedded processors through decode filter cache , 2002, Proceedings 2002 Design, Automation and Test in Europe Conference and Exhibition.
[2] Ibrahim N. Hajj,et al. Architectural and compiler support for energy reduction in the memory hierarchy of high performance microprocessors , 1998, Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379).
[3] Chau-Wen Tseng,et al. Improving data locality with loop transformations , 1996, TOPL.
[4] Emilio L. Zapata,et al. Set Associative Cache Behavior Optimization , 1999, Euro-Par.
[5] François Bodin,et al. Accurate Data Distribution into Blocks may Boost Cache Performance , 1997 .
[6] Tajana Simunic,et al. Remote power control of wireless network interfaces , 2003, J. Embed. Comput..
[7] Yves Robert,et al. Loop nest scheduling and transformations , 1993 .
[8] Wei Li,et al. Unifying data and control transformations for distributed shared-memory machines , 1995, PLDI '95.
[9] Keshav Pingali,et al. A Singular Loop Transformation Framework Based on Non-Singular Matrices , 1992, LCPC.
[10] Duncan H. Lawrie,et al. On the Performance Enhancement of Paging Systems Through Program Analysis and Transformations , 1981, IEEE Transactions on Computers.
[11] H. De Man,et al. SynGuide: An environment for doing interactive correctness preserving transformations , 1993, Proceedings of IEEE Workshop on VLSI Signal Processing.
[12] Nikil D. Dutt,et al. System and architecture-level power reduction of microprocessor-based communication and multi-media applications , 2000, IEEE/ACM International Conference on Computer Aided Design. ICCAD - 2000. IEEE/ACM Digest of Technical Papers (Cat. No.00CH37140).
[13] Rudy Lauwereins,et al. Instruction buffering exploration for low energy VLIWs with instruction clusters , 2004 .
[14] Kazuaki Murakami,et al. A history-based I-cache for low-energy multimedia applications , 2002, ISLPED '02.
[15] E.H.L. Aarts,et al. Period assignment in multidimensional periodic scheduling , 1998, 1998 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (IEEE Cat. No.98CB36287).
[16] Frank Vahid,et al. Dynamic loop caching meets preloaded loop caching-a hybrid approach , 2002, Proceedings. IEEE International Conference on Computer Design: VLSI in Computers and Processors.
[17] Anantha Chandrakasan,et al. Algorithmic transforms for efficient energy scalable computation , 2000, ISLPED'00: Proceedings of the 2000 International Symposium on Low Power Electronics and Design (Cat. No.00TH8514).
[18] Luca Benini,et al. A survey of design techniques for system-level dynamic power management , 2000, IEEE Trans. Very Large Scale Integr. Syst..
[19] Albert van der Werf,et al. Mapping array communication onto FIFO communication - towards an implementation , 2000, ISSS '00.
[20] Aviral Shrivastava,et al. An efficient compiler technique for code size reduction using reduced bit-width ISAs , 2002, Proceedings 2002 Design, Automation and Test in Europe Conference and Exhibition.
[21] Johan A. Pouwelse,et al. Energy priority scheduling for variable voltage processors , 2001, ISLPED '01.
[22] David A. Patterson,et al. Computer Architecture: A Quantitative Approach , 1969 .
[23] Jef L. van Meerbergen,et al. Memory arbitration and cache management in stream-based systems , 2000, DATE '00.
[24] Frank Vahid,et al. Synthesis of customized loop caches for core-based embedded systems , 2002, ICCAD 2002.
[25] Rajendra Yavatkar,et al. A CPU Scheduling Algorithm for Continuous Media Applications , 1995, NOSSDAV.
[26] Preeti Ranjan Panda,et al. Memory bank customization and assignment in behavioral synthesis , 1999, 1999 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (Cat. No.99CH37051).
[27] Frank Vahid,et al. Exploiting Fixed Programs in Embedded Systems: A Loop Cache Example , 2002, IEEE Computer Architecture Letters.
[28] Nikil D. Dutt,et al. Memory aware compilation through accurate timing extraction , 2000, Proceedings 37th Design Automation Conference.
[29] Fredrik Dahlgren,et al. Exploration of the spatial locality on emerging applications and the consequences for cache performance , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[30] Jörg Henkel,et al. I-CoPES: fast instruction code placement for embedded systems to improve performance and energy efficiency , 2001, IEEE/ACM International Conference on Computer Aided Design. ICCAD 2001. IEEE/ACM Digest of Technical Papers (Cat. No.01CH37281).
[31] Paul Feautrier. Compiling for massively parallel architectures: a perspective , 1995, Microprocess. Microprogramming.
[32] Keshav Pingali,et al. Access normalization: loop restructuring for NUMA computers , 1993, TOCS.
[33] Tibor Gyimóthy,et al. Survey of code-size reduction methods , 2003, CSUR.
[34] Praveen K. Murthy,et al. A buffer merging technique for reducing memory requirements of synchronous dataflow specifications , 1999, Proceedings 12th International Symposium on System Synthesis.
[35] Edwin Hsing-Mean Sha,et al. Multi-dimensional interleaving for time-and-memory design optimization , 1995, Proceedings of ICCD '95 International Conference on Computer Design. VLSI in Computers and Processors.
[36] Kathryn S. McKinley,et al. A Compiler Optimization Algorithm for Shared-Memory Multiprocessors , 1998, IEEE Trans. Parallel Distributed Syst..
[37] Donald E. Thomas,et al. The system architect's workbench , 1988, DAC '88.
[38] Rajesh K. Gupta,et al. Design of a predictive filter cache for energy savings in high performance processor architectures , 2001, Proceedings 2001 IEEE International Conference on Computer Design: VLSI in Computers and Processors. ICCD 2001.
[39] Chi-Ying Tsui,et al. Low power motion estimation design using adaptive pixel truncation , 1997, ISLPED '97.
[40] Luca Benini,et al. Software-controlled processor speed setting for low-power streaming multimedia , 2001, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..
[41] David A. Padua,et al. Advanced compiler optimizations for supercomputers , 1986, CACM.
[42] Weiyu Tang,et al. Reducing power with an L0 instruction cache using history-based prediction , 2002, International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems.
[43] Ken Kennedy,et al. Vector Register Allocation , 1992, IEEE Trans. Computers.
[44] William Jalby,et al. A strategy for array management in local memory , 1994, Math. Program..
[45] Jörg Henkel,et al. Code compression for low power embedded system design , 2000, Proceedings 37th Design Automation Conference.
[46] Rudolf Eigenmann,et al. Automatic program parallelization , 1993, Proc. IEEE.
[47] G. Albera,et al. Power/performance advantages of victim buffer in high-performance processors , 1999, Proceedings IEEE Alessandro Volta Memorial Workshop on Low-Power Design.
[48] Luca Benini,et al. Contents provider-assisted dynamic voltage scaling for low energy multimedia applications , 2002, ISLPED '02.
[49] Mani B. Srivastava,et al. Power-aware multimedia systems using run-time prediction , 2001, VLSI Design 2001. Fourteenth International Conference on VLSI Design.
[50] Keshab K. Parhi,et al. Algorithm transformation techniques for concurrent processors , 1989, Proc. IEEE.
[51] Hock-Beng Lim,et al. Efficient integration of compiler-directed cache coherence and data prefetching , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[52] Diederik Verkest,et al. Global multimedia system design exploration using accurate memory organization feedback , 1999, DAC '99.
[53] Patrice Quinton,et al. The Alpha du Centaur experiment , 1992 .
[54] Flavius Gruian,et al. Energy-Centric Scheduling for Real-Time Systems , 2002 .
[55] P. Feautrier. Compiling for Massively Parallel Architectures , 1995 .
[56] Hugo De Man,et al. A preprocessing step for global loop transformations for data transfer optimization , 2000, CASES '00.
[57] Wei-Chung Cheng,et al. Power-Aware Bus Encoding Techniques for I/O and Data Buses in an Embedded System , 2002, J. Circuits Syst. Comput..
[58] Hugo De Man,et al. Flow graph balancing for minimizing the required memory bandwidth , 1996, Proceedings of 9th International Symposium on Systems Synthesis.
[59] Chaitali Chakrabarti,et al. Memory exploration for low power, embedded systems , 1999, DAC '99.
[60] Luca Benini,et al. Cached-code compression for energy minimization in embedded processors , 2001, ISLPED '01.
[61] Keshav Pingali,et al. Access normalization: loop restructuring for NUMA compilers , 1992, ASPLOS V.
[62] Preeti Ranjan Panda,et al. Memory optimizations and exploration for embedded systems , 1998 .
[63] Mahmut T. Kandemir,et al. Reducing memory requirements of nested loops for embedded systems , 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).
[64] Song Chen,et al. Synthesis of custom interleaved memory systems , 2000, IEEE Trans. Very Large Scale Integr. Syst..
[65] Anantha Chandrakasan,et al. A framework for energy-scalable communication in high-density wireless networks , 2002, ISLPED '02.
[66] Rudy Lauwereins,et al. Instruction buffering exploration for low energy embedded processors , 2005, J. Embed. Comput..
[67] Nikil D. Dutt,et al. MIST: an algorithm for memory miss traffic management , 2000, IEEE/ACM International Conference on Computer Aided Design. ICCAD - 2000. IEEE/ACM Digest of Technical Papers (Cat. No.00CH37140).
[68] Luca Benini,et al. Dynamic voltage scaling and power management for portable systems , 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).
[69] Mahmut T. Kandemir,et al. A Holistic Approach to System Level Energy Optimization , 2000, PATMOS.
[70] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[71] Henk Corporaal,et al. A Low Energy Clustered Instruction Memory Hierarchy for Long Instruction Word Processors , 2002, PATMOS.
[72] Anantha Chandrakasan,et al. Energy scalable system design , 2002, IEEE Trans. Very Large Scale Integr. Syst..
[73] Guido Araujo,et al. Compressed code execution on DSP architectures , 1999, Proceedings 12th International Symposium on System Synthesis.
[74] Raminder Singh Bajwa,et al. Instruction buffering to reduce power in processors for signal processing , 1997, IEEE Trans. Very Large Scale Integr. Syst..
[75] Monica S. Lam,et al. Maximizing Multiprocessor Performance with the SUIF Compiler , 1996, Digit. Tech. J..
[76] Nikil D. Dutt,et al. Data cache sizing for embedded processor applications , 1998, Proceedings Design, Automation and Test in Europe.
[77] Ken Kennedy,et al. The memory bandwidth bottleneck and its amelioration by a compiler , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[78] Kaushik Roy,et al. Reducing set-associative cache energy via way-prediction and selective direct-mapping , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[79] Narayanan Vijaykrishnan,et al. Instruction scheduling based on energy and performance constraints , 2000, Proceedings IEEE Computer Society Workshop on VLSI 2000. System Design for a System-on-Chip Era.
[80] Anne Mignotte,et al. Loop alignment for memory accesses optimization , 1999, Proceedings 12th International Symposium on System Synthesis.
[81] Francky Catthoor. Energy-Delay Efficient Data Storage and Transfer Architectures and Methodologies: Current Solutions and Remaining Problems , 1999, J. VLSI Signal Process..
[82] Tajana Simunic,et al. A low-power, fixed-point, front-end feature extraction for a distributed speech recognition system , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[83] Anantha P. Chandrakasan,et al. Low-power CMOS digital design , 1992 .
[84] G. Venkatesh,et al. Extensions to programmable DSP architectures for reduced power dissipation , 1998, Proceedings Eleventh International Conference on VLSI Design.
[85] Laszlo A. Belady,et al. A Study of Replacement Algorithms for Virtual-Storage Computer , 1966, IBM Syst. J..
[86] Harry Berryman,et al. Multiprocessors and run-time compilation , 1991, Concurr. Pract. Exp..
[87] Chau-Wen Tseng,et al. An Overview of the SUIF Compiler for Scalable Parallel Machines , 1995, PPSC.
[88] Hiroto Yasuura,et al. A power reduction technique with object code merging for application specific embedded processors , 2000, DATE '00.
[89] Luca Benini,et al. Selective instruction compression for memory energy reduction in embedded systems , 1999, Proceedings. 1999 International Symposium on Low Power Electronics and Design (Cat. No.99TH8477).
[90] Wen-mei W. Hwu,et al. Enhancing loop buffering of media and telecommunications applications using low-overhead predication , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.
[91] Enric Morancho,et al. A Unified Transformation Technique for Multilevel Blocking , 1996, Euro-Par, Vol. I.
[92] Dongkun Shin,et al. An Operation Rearrangement Technique for Low-Power VLIW Instruction Fetch , 2000 .
[93] Nikil D. Dutt,et al. Local memory exploration and optimization in embedded systems , 1999, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..
[94] Henk L. Muller,et al. Predictable instruction caching for media processors , 2002, Proceedings IEEE International Conference on Application-Specific Systems, Architectures, and Processors.
[95] M. Liou,et al. Reducing hardware complexity of motion estimation algorithms using truncated pixels , 1997, Proceedings of 1997 IEEE International Symposium on Circuits and Systems. Circuits and Systems in the Information Age ISCAS '97.
[96] John A. Chandy,et al. The Paradigm Compiler for Distributed-Memory Multicomputers , 1995, Computer.
[97] Mi Lu,et al. An Iteration Partition Approach for Cache or Local Memory Thrashing on Parallel Processing , 1991, IEEE Trans. Computers.
[98] Naehyuck Chang,et al. Low-power color TFT LCD display for hand-held embedded systems , 2002, ISLPED '02.
[99] Mahmut T. Kandemir,et al. Power-aware partitioned cache architectures , 2001, ISLPED '01.
[100] Edwin Hsing-Mean Sha,et al. Full Parallelism in Uniform Nested Loops Using Multi-Dimensional Retiming , 1994, 1994 International Conference on Parallel Processing Vol. 2.
[101] Vittorio Zaccaria,et al. An instruction-level methodology for power estimation and optimization of embedded VLIW cores , 2002, Proceedings 2002 Design, Automation and Test in Europe Conference and Exhibition.
[102] L. Benini,et al. A Power Modeling and Estimation Framework for VLIW-based Embedded Systems , 2001 .
[103] Kemal Ebcioglu,et al. A study on the number of memory ports in multiple instruction issue machines , 1993, MICRO 1993.
[104] Bjorn De Sutter,et al. Compiler techniques for code compaction , 2000, TOPL.
[105] Edward W. Davis,et al. A Software Approach to Avoiding Spatial Cache Collisions in Parallel Processor Systems , 1998, IEEE Trans. Parallel Distributed Syst..
[106] William Jalby,et al. A Quantitative Algorithm for Data Locality Optimization , 1991, Code Generation.
[107] Kanad Ghose,et al. Analytical energy dissipation models for low-power caches , 1997, ISLPED '97.
[108] William H. Mangione-Smith,et al. Filtering Memory References to Increase Energy Efficiency , 2000, IEEE Trans. Computers.
[109] Thijs Krol,et al. A transformational approach to VHDL and CDFG based high-level synthesis: a case study , 1995, Proceedings of the IEEE 1995 Custom Integrated Circuits Conference.
[110] N.K. Jha,et al. Removal of memory access bottlenecks for scheduling control-flow intensive behavioral descriptions , 1998, 1998 IEEE/ACM International Conference on Computer-Aided Design. Digest of Technical Papers (IEEE Cat. No.98CB36287).
[111] Miodrag Potkonjak,et al. Energy minimization with guaranteed quality of service , 2000, ISLPED '00.
[112] David H. Albonesi,et al. Selective cache ways: on-demand cache resource allocation , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.
[113] Lama H. Chandrasena,et al. A comprehensive analysis of energy savings in dynamic supply voltage scaling systems using data dependent voltage level selection , 2000, 2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532).
[114] Luca Benini,et al. Low Power Control Techniques For TFT LCD Displays , 2002, CASES '02.
[115] Klara Nahrstedt,et al. R-EDF: a reservation-based EDF scheduling algorithm for multiple multimedia task classes , 2001, Proceedings Seventh IEEE Real-Time Technology and Applications Symposium.
[116] Peter Marwedel,et al. Assigning program and data objects to scratchpad for energy reduction , 2002, Proceedings 2002 Design, Automation and Test in Europe Conference and Exhibition.
[117] Luca Benini,et al. System-level power optimization: techniques and tools , 1999, Proceedings. 1999 International Symposium on Low Power Electronics and Design (Cat. No.99TH8477).
[118] Hugo De Man,et al. Minimizing the required memory bandwidth in VLSI system realizations , 1999, IEEE Trans. Very Large Scale Integr. Syst..
[119] Edward A. Lee,et al. Optimal parenthesization of lexical orderings for DSP block diagrams , 1995, VLSI Signal Processing, VIII.
[120] Francky Catthoor,et al. Custom Memory Management Methodology: Exploration of Memory Organisation for Embedded Multimedia System Design , 1998 .
[121] Dirk Grunwald,et al. Predictive sequential associative cache , 1996, Proceedings. Second International Symposium on High-Performance Computer Architecture.
[122] Nikil D. Dutt,et al. Minimization of Memory Traffic in High-Level Synthesis , 1994, 31st Design Automation Conference.
[123] Erik Brockmeyer,et al. Storage Management Programmable Process , 2002 .
[124] Daniel C. McCrackin. Eliminating Interlocks in Deeply Pipelined Processors by Delay Enforced Multistreaming , 1991, IEEE Trans. Computers.
[125] David B. Loveman,et al. Program Improvement by Source-to-Source Transformation , 1977, J. ACM.
[126] Lionel M. Ni,et al. Dependence Uniformization: A Loop Parallelization Technique , 1993, IEEE Trans. Parallel Distributed Syst..
[127] Weijia Shang,et al. Generalized cycle shrinking , 1991, Algorithms and Parallel VLSI Architectures.
[128] Norman P. Jouppi,et al. Improving direct-mapped cache performance by the addition of a small fully-associative cache and pre , 1990, ISCA 1990.
[129] Frank Vahid,et al. Tuning of loop cache architectures to programs in embedded system design , 2002, 15th International Symposium on System Synthesis, 2002.
[130] Klara Nahrstedt,et al. A middleware framework coordinating processor/power resource management for multimedia applications , 2001, GLOBECOM'01. IEEE Global Telecommunications Conference (Cat. No.01CH37270).
[131] Alexandru Nicolau,et al. Loop Quantization: A Generalized Loop Unwinding Technique , 1988, J. Parallel Distributed Comput..
[132] Margaret Martonosi,et al. Characterizing the Memory Behavior of Compiler-Parallelized Applications , 1996, IEEE Trans. Parallel Distributed Syst..
[133] Dimitrios Soudris,et al. A code transformation-based methodology for improving I-cache performance of DSP applications , 2002, Proceedings 2002 Design, Automation and Test in Europe Conference and Exhibition.
[134] William Pugh,et al. Generating schedules and code within a unified reordering transformation framework , 1992 .
[135] Ken Kennedy,et al. The parascope editor: an interactive parallel programming tool , 1993, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[136] Mahmut T. Kandemir,et al. Partitioned instruction cache architecture for energy efficiency , 2003, TECS.
[137] Zhigang Chen,et al. On Uniformization of Affine Dependence Algorithms , 1996, IEEE Trans. Computers.
[138] Yike Guo,et al. Parallelizing Conditional Recurrences , 1996, Euro-Par, Vol. I.
[139] Luca Benini,et al. Dynamic frequency scaling with buffer insertion for mixed workloads , 2002, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..
[140] Sumedh W. Sathaye,et al. Instruction fetch mechanisms for VLIW architectures with compressed encodings , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.
[141] Lothar Thiele,et al. On the design of piecewise regular processor arrays , 1989, IEEE International Symposium on Circuits and Systems.
[142] Hiroshi Nakamura,et al. Augmenting Loop Tiling with Data Alignment for Improved Cache Performance , 1999, IEEE Trans. Computers.
[143] Corinne Ancourt,et al. Automatic data mapping of signal processing applications , 1997, Proceedings IEEE International Conference on Application-Specific Systems, Architectures and Processors.