Customized pipeline and instruction set architecture for embedded processing engines
[1] William J. Dally,et al. The GPU Computing Era , 2010, IEEE Micro.
[2] T. N. Vijaykumar,et al. Reducing register ports for higher speed and lower energy , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-35).
[3] Yifan He,et al. Energy efficient special instruction support in an embedded processor with compact ISA , 2012, CASES '12.
[4] Victor V. Zyuban,et al. The energy complexity of register files , 1998, Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379).
[5] Hoi-Jun Yoo,et al. A 345 mW Heterogeneous Many-Core Processor With an Intelligent Inference Engine for Robust Object Recognition , 2011, IEEE Journal of Solid-State Circuits.
[6] Sied Mehdi Fakhraie,et al. Instruction set architectural guidelines for embedded packet-processing engines , 2012, J. Syst. Archit..
[7] Paolo Bonzini,et al. Recurrence-Aware Instruction Set Selection for Extensible Embedded Processors , 2008, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[8] Samuel Naffziger,et al. An x86-64 core implemented in 32nm SOI CMOS , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).
[9] Mark D. Hill,et al. Amdahl's Law in the Multicore Era , 2008, Computer.
[10] Paolo Faraboschi,et al. Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools , 2004 .
[11] Mark Horowitz,et al. Energy-performance tradeoffs in processor architecture and circuit design: a marginal cost analysis , 2010, ISCA.
[12] 장훈,et al. [Book review] "Computer Organization and Design, The Hardware/Software Interface" , 1997 .
[13] Paolo Ienne,et al. Exploiting pipelining to relax register-file port constraints of instruction-set extensions , 2005, CASES '05.
[14] Jason Cong,et al. Instruction set extension with shadow registers for configurable processors , 2005, FPGA '05.
[15] Stijn Eyerman,et al. Modeling critical sections in Amdahl's law and its implications for multicore design , 2010, ISCA '10.
[16] Ha Pham,et al. A 40nm 16-core 128-thread CMT SPARC SoC processor , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).
[17] Hamid Noori,et al. Energy-aware design space exploration of registerfile for extensible processors , 2010, 2010 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation.
[18] Koichi Yamazaki,et al. A note on greedy algorithms for the maximum weighted independent set problem , 2003, Discret. Appl. Math..
[19] Kingshuk Karuri,et al. Increasing data-bandwidth to instruction-set extensions through register clustering , 2007, 2007 IEEE/ACM International Conference on Computer-Aided Design.
[20] Zhiyi Yu,et al. A 167-Processor Computational Platform in 65 nm CMOS , 2009, IEEE Journal of Solid-State Circuits.
[21] Di Wu,et al. Resource-shared custom instruction generation under performance/area constraints , 2012, 2012 International Symposium on System on Chip (SoC).
[22] Nikil D. Dutt,et al. Introduction of local memory elements in instruction set extensions , 2004, Proceedings. 41st Design Automation Conference, 2004.
[23] T. N. Vijaykumar,et al. Reducing register ports for higher speed and lower energy , 2002, MICRO.
[24] Sied Mehdi Fakhraie,et al. Architecture-Aware Graph-Covering Algorithm for Custom Instruction Selection , 2010, 2010 5th International Conference on Future Information Technology.
[25] Aviral Shrivastava,et al. Register File Power Reduction Using Bypass Sensitive Compiler , 2008, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[26] Wayne Luk,et al. Optimizing Instruction-set Extensible Processors under Data Bandwidth Constraints , 2007, 2007 Design, Automation & Test in Europe Conference & Exhibition.
[27] Cid C. de Souza,et al. Efficient datapath merging for partially reconfigurable architectures , 2005, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[28] Chen-Yong Cher,et al. A wire-speed Power™ processor: 2.3GHz 45nm SOI with 16 cores and 64 threads , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).
[29] Preeti Ranjan Panda,et al. Customization of Register File Banking Architecture for Low Power , 2007, 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems (VLSID'07).
[30] Tulika Mitra,et al. Scalable custom instructions identification for instruction-set extensible processors , 2004, CASES '04.
[31] Douglas L. Maskell,et al. Fast Identification of Custom Instructions for Extensible Processors , 2007, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[32] Sied Mehdi Fakhraie,et al. Quantitative analysis of packet-processing applications regarding architectural guidelines for network-processing-engine development , 2009, J. Syst. Archit..
[33] Nigel P. Topham,et al. Design-Space Exploration of Resource-Sharing Solutions for Custom Instruction Set Extensions , 2009, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[34] David Wentzlaff,et al. Processor: A 64-Core SoC with Mesh Interconnect , 2010 .
[35] Paolo Ienne,et al. Exact and approximate algorithms for the extension of embedded processor instruction sets , 2006, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[36] David A. Patterson,et al. Computer Organization and Design, Fourth Edition: The Hardware/Software Interface (The Morgan Kaufmann Series in Computer Architecture and Design) , 2008 .
[37] Kingshuk Karuri,et al. Increasing data-bandwidth to instruction-set extensions through register clustering , 2007, ICCAD 2007.
[38] Timothy Mattson,et al. A 48-Core IA-32 message-passing processor with DVFS in 45nm CMOS , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).
[39] Haibin Liu,et al. Exploiting forwarding to improve data bandwidth of instruction-set extensions , 2006, 2006 43rd ACM/IEEE Design Automation Conference.
[40] Majid Sarrafzadeh,et al. Area-efficient instruction set synthesis for reconfigurable system-on-chip designs , 2004, Proceedings. 41st Design Automation Conference, 2004.
[41] Douglas L. Maskell,et al. Supporting multiple-input, multiple-output custom functions in configurable processors , 2007, J. Syst. Archit..
[42] Tilman Wolf,et al. Analysis of Network Processing Workloads , 2005, IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005.
[43] Scott A. Mahlke,et al. Processor Acceleration Through Automated Instruction Set Customization , 2003, MICRO.
[44] Ricardo E. Gonzalez,et al. Xtensa: A Configurable and Extensible Processor , 2000, IEEE Micro.
[45] Kingshuk Karuri,et al. A design flow for configurable embedded processors based on optimized instruction set extension synthesis , 2006, Proceedings of the Design Automation & Test in Europe Conference.
[47] Trevor N. Mudge,et al. Reducing register ports using delayed write-back queues and operand pre-fetch , 2003, ICS '03.
[48] Sied Mehdi Fakhraie,et al. Locality considerations in exploring custom instruction selection algorithms , 2010, 2nd Asia Symposium on Quality Electronic Design (ASQED).
[49] Paolo Ienne,et al. Automatic application-specific instruction-set extensions under microarchitectural constraints , 2003, Proceedings 2003. Design Automation Conference (IEEE Cat. No.03CH37451).
[50] Steven Swanson,et al. Area-Performance Trade-offs in Tiled Dataflow Architectures , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[51] Martin D. F. Wong,et al. Efficient ASIP design for configurable processors with fine-grained resource sharing , 2008, FPGA '08.
[52] Nachiket Kapre,et al. Performance comparison of single-precision SPICE Model-Evaluation on FPGA, GPU, Cell, and multi-core processors , 2009, 2009 International Conference on Field Programmable Logic and Applications.
[53] Paolo Ienne,et al. Fast, Nearly Optimal ISE Identification With I/O Serialization Through Maximal Clique Enumeration , 2010, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[54] Wayne Luk,et al. CHIPS: Custom Hardware Instruction Processor Synthesis , 2008, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.