Alleviating Scalability Limitation of Accelerator-Based Platforms