3D NoC-enabled heterogeneous manycore architectures for accelerating CNN training: Performance and thermal trade-offs
暂无分享,去创建一个
Radu Marculescu | Partha Pratim Pande | Diana Marculescu | Janardhan Rao Doppa | Wonje Choi | Ryan Gary Kim | Biresh Kumar Joardar | P. Pande | R. Marculescu | Diana Marculescu | R. Kim | Wonje Choi | J. Doppa
[1] Partha Pratim Pande,et al. Networks-on-Chip in a Three-Dimensional Environment: A Performance Evaluation , 2009, IEEE Transactions on Computers.
[2] Jun Yang,et al. Thermal Management for 3D Processors via Task Scheduling , 2008, 2008 37th International Conference on Parallel Processing.
[3] Ankur Jain,et al. Die/wafer stacking with reciprocal design symmetry (RDS) for mask reuse in three-dimensional (3D) integration technology , 2009, 2009 10th International Symposium on Quality Electronic Design.
[4] Arvind Kumar,et al. Three-dimensional integrated circuits , 2006, IBM J. Res. Dev..
[5] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[6] John Kim,et al. Throughput-Effective On-Chip Networks for Manycore Accelerators , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[7] Yuan Xie,et al. 3D GPU architecture using cache stacking: Performance, cost, power and thermal analysis , 2009, 2009 IEEE International Conference on Computer Design.
[8] Mahmut T. Kandemir,et al. Design and Management of 3D Chip Multiprocessors Using Network-in-Memory , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).
[9] Jason Cong,et al. A thermal-driven floorplanning algorithm for 3D ICs , 2004, IEEE/ACM International Conference on Computer Aided Design, 2004. ICCAD-2004..
[10] Nam Sung Kim,et al. GPUWattch: enabling energy optimizations in GPGPUs , 2013, ISCA.
[11] Olav Lysne,et al. Layered routing in irregular networks , 2006, IEEE Transactions on Parallel and Distributed Systems.
[12] David A. Wood,et al. GPU Computing Pipeline Inefficiencies and Optimization Opportunities in Heterogeneous CPU-GPU Processors , 2015, 2015 IEEE International Symposium on Workload Characterization.
[13] Ujjwal Maulik,et al. A Simulated Annealing-Based Multiobjective Optimization Algorithm: AMOSA , 2008, IEEE Transactions on Evolutionary Computation.
[14] Martin Burtscher,et al. Bridging the processor-memory performance gap with 3D IC technology , 2005, IEEE Design & Test of Computers.
[15] David A. Wood,et al. gem5-gpu: A Heterogeneous CPU-GPU Simulator , 2015, IEEE Computer Architecture Letters.
[16] Jian Xu,et al. Demystifying 3D ICs: the pros and cons of going vertical , 2005, IEEE Design & Test of Computers.
[17] Jia Wang,et al. DaDianNao: A Machine-Learning Supercomputer , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[18] Sudhakar Yalamanchili,et al. Design space exploration of on-chip ring interconnection for a CPU-GPU heterogeneous architecture , 2013, J. Parallel Distributed Comput..
[19] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[20] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[21] David Atienza,et al. 3D-ICE: Fast compact transient thermal modeling for 3D ICs with inter-tier liquid cooling , 2010, 2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[22] Radu Marculescu,et al. Hybrid network-on-chip architectures for accelerating deep learning kernels on heterogeneous manycore platforms , 2016, 2016 International Conference on Compliers, Architectures, and Sythesis of Embedded Systems (CASES).
[23] Mahmut T. Kandemir,et al. Managing GPU Concurrency in Heterogeneous Architectures , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[24] Jinchun Kim,et al. Bandwidth-efficient on-chip interconnect designs for GPGPUs , 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[25] Ran Ginosar,et al. Network-on-Chip Architectures for Neural Networks , 2010, 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip.
[26] Partha Pratim Pande,et al. Design-Space Exploration and Optimization of an Energy-Efficient and Reliable 3-D Small-World Network-on-Chip , 2016, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.