ALPHA: A Learning-Enabled High-Performance Network-on-Chip Router Design for Heterogeneous Manycore Architectures
Ahmed Louri | Yuan Li
[1] William J. Dally,et al. GOAL: a load-balanced adaptive routing algorithm for torus networks , 2003, ISCA '03.
[2] Nick McKeown,et al. The iSLIP scheduling algorithm for input-queued switches , 1999, TNET.
[3] Pedro López,et al. A family of mechanisms for congestion control in wormhole networks , 2005, IEEE Transactions on Parallel and Distributed Systems.
[4] Yuan Xie,et al. Packet Pump: Overcoming Network Bottleneck in On-Chip Interconnects for GPGPUs , 2018, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC).
[5] Maurizio Palesi,et al. ProNoC: A low latency network-on-chip based many-core system-on-chip prototyping platform , 2017, Microprocess. Microsystems.
[6] Xiaola Lin,et al. The Repetitive Turn Model for Adaptive Routing , 2017, IEEE Transactions on Computers.
[7] Jinchun Kim,et al. Bandwidth-efficient on-chip interconnect designs for GPGPUs , 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[8] Simon W. Moore,et al. Low-latency virtual-channel routers for on-chip networks , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.
[9] Natalie D. Enright Jerger,et al. On-Chip Networks , 2009, On-Chip Networks.
[10] S. Haykin,et al. Neural Networks: A Comprehensive Foundation , 1994.
[11] Ahmed Louri,et al. Extending the Power-Efficiency and Performance of Photonic Interconnects for Heterogeneous Multicores with Machine Learning , 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[12] John Kim,et al. Providing cost-effective on-chip network bandwidth in GPGPUs , 2012, 2012 IEEE 30th International Conference on Computer Design (ICCD).
[13] Chita R. Das,et al. Aérgia: exploiting packet latency slack in on-chip networks , 2010, ISCA.
[14] Timothy Mark Pinkston,et al. Communication-Aware Globally-Coordinated On-Chip Networks , 2012, IEEE Transactions on Parallel and Distributed Systems.
[15] Scott B. Baden,et al. Redefining the Role of the CPU in the Era of CPU-GPU Integration , 2012, IEEE Micro.
[16] Ahmed Louri,et al. Dynamic Voltage and Frequency Scaling in NoCs with Supervised and Reinforcement Learning Techniques , 2019, IEEE Transactions on Computers.
[17] Radu Marculescu,et al. SVR-NoC: A performance analysis tool for Network-on-Chips using learning-based support vector regression model , 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[18] Chung-Ta King,et al. TS-Router: On maximizing the Quality-of-Allocation in the On-Chip Network , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[19] Chen Sun,et al. DSENT - A Tool Connecting Emerging Photonics with Electronics for Opto-Electronic Networks-on-Chip Modeling , 2012, 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip.
[20] Jeffrey S. Vetter,et al. A Survey of CPU-GPU Heterogeneous Computing Techniques , 2015, ACM Comput. Surv.
[21] Niraj K. Jha,et al. A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS , 2007, ICCD.
[22] David Blaauw,et al. VIX: Virtual Input Crossbar for efficient switch allocation , 2014, 2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC).
[23] Yuankun Xue,et al. User Cooperation Network Coding Approach for NoC Performance Improvement , 2015, NOCS.
[24] David A. Wood,et al. gem5-gpu: A Heterogeneous CPU-GPU Simulator , 2015, IEEE Computer Architecture Letters.
[25] Lionel M. Ni,et al. The turn model for adaptive routing , 1998, ISCA '98.
[26] Hamid Sarbazi-Azad,et al. BiNoCHS: Bimodal network-on-chip for CPU-GPU heterogeneous systems , 2017, 2017 Eleventh IEEE/ACM International Symposium on Networks-on-Chip (NOCS).
[27] Srinivasan Seshan,et al. On-chip networks from a networking perspective: congestion and scalability in many-core interconnects , 2012, SIGCOMM '12.
[28] Chita R. Das,et al. OSCAR: Orchestrating STT-RAM cache traffic for heterogeneous CPU-GPU architectures , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[29] Olivier Temam,et al. Reconciling specialization and flexibility through compound circuits , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[30] William J. Dally,et al. Flattened butterfly: a cost-efficient topology for high-radix networks , 2007, ISCA '07.
[31] José Duato,et al. Adaptive bubble router: a design to improve performance in torus networks , 1999, Proceedings of the 1999 International Conference on Parallel Processing.
[32] Ahmed Louri,et al. Machine learning enabled power-aware Network-on-Chip design , 2017, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.
[33] Chris Fallin,et al. Next generation on-chip networks: what kind of congestion control do we need? , 2010, Hotnets-IX.
[34] William J. Dally,et al. Deadlock-Free Message Routing in Multiprocessor Interconnection Networks , 1987, IEEE Transactions on Computers.
[35] Steven Swanson,et al. Conservation cores: reducing the energy of mature computations , 2010, ASPLOS XV.
[36] Kevin Skadron,et al. Rodinia: A benchmark suite for heterogeneous computing , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[37] Henry Wong,et al. Analyzing CUDA workloads using a detailed GPU simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[38] Luca Benini,et al. A multi-path routing strategy with guaranteed in-order packet delivery and fault-tolerance for networks on chip , 2006, 2006 43rd ACM/IEEE Design Automation Conference.
[39] John Kim,et al. Throughput-Effective On-Chip Networks for Manycore Accelerators , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[40] Avinash Kodi,et al. LEAD: Learning-enabled Energy-Aware Dynamic Voltage/frequency scaling in NoCs , 2018, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC).
[41] Jason Cong,et al. On-chip interconnection network for accelerator-rich architectures , 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[42] Ahmed Louri,et al. EZ-Pass: An Energy & Performance-Efficient Power-Gating Router Architecture for Scalable NoCs , 2018, IEEE Computer Architecture Letters.
[43] Kyung Hoon Kim,et al. Packet coalescing exploiting data redundancy in GPGPU architectures , 2017, ICS.
[44] David A. Wood,et al. Heterogeneous system coherence for integrated CPU-GPU systems , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[45] Stephen W. Keckler,et al. Regional congestion awareness for load balance in networks-on-chip , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.
[46] Sudhakar Yalamanchili,et al. Adaptive virtual channel partitioning for network-on-chip in heterogeneous architectures , 2013, ACM Trans. Design Autom. Electr. Syst.
[47] Shahin Nazarian,et al. Self-Optimizing and Self-Programming Computing Systems: A Combined Compiler, Complex Networks, and Machine Learning Approach , 2019, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[48] Chita R. Das,et al. A low latency router supporting adaptivity for on-chip interconnects , 2005, Proceedings. 42nd Design Automation Conference, 2005.
[49] Niraj K. Jha,et al. Express virtual channels: towards the ideal interconnection fabric , 2007, ISCA '07.
[50] Mahmut T. Kandemir,et al. Managing GPU Concurrency in Heterogeneous Architectures , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[51] Yuval Tamir,et al. Symmetric Crossbar Arbiters for VLSI Communication Switches , 1993, IEEE Trans. Parallel Distributed Syst.
[52] Ahmed Louri,et al. Dynamic error mitigation in NoCs using intelligent prediction techniques , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[53] Ahmed Louri,et al. A Versatile and Flexible Chiplet-based System Design for Heterogeneous Manycore Architectures , 2020, 2020 57th ACM/IEEE Design Automation Conference (DAC).
[54] Pedro López,et al. A congestion control mechanism for wormhole networks , 2001, Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing.
[55] Chita R. Das,et al. Application-aware prioritization mechanisms for on-chip networks , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[56] Radu Marculescu,et al. A traffic-aware adaptive routing algorithm on a highly reconfigurable network-on-chip architecture , 2012, CODES+ISSS.
[57] Kevin Skadron,et al. A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads , 2010, IEEE International Symposium on Workload Characterization (IISWC'10).
[58] Mark Handley,et al. Congestion control for high bandwidth-delay product networks , 2002, SIGCOMM '02.
[59] Natalie D. Enright Jerger,et al. Achieving predictable performance through better memory controller placement in many-core CMPs , 2009, ISCA '09.
[60] Kevin Kai-Wei Chang,et al. HAT: Heterogeneous Adaptive Throttling for On-Chip Networks , 2012, 2012 IEEE 24th International Symposium on Computer Architecture and High Performance Computing.
[61] Ahmed Louri,et al. High-performance, Energy-efficient, Fault-tolerant Network-on-Chip Design Using Reinforcement Learning , 2019, 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[62] Ahmed Louri,et al. IntelliNoC: A Holistic Design Framework for Energy-Efficient and Reliable On-Chip Communication for Manycores , 2019, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA).
[63] Yuan Yao,et al. Opportunistic Competition Overhead Reduction for Expediting Critical Section in NoC Based CMPs , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[64] William J. Dally,et al. Allocator implementations for network-on-chip routers , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[65] S. Lennart Johnsson,et al. ROMM routing on mesh and torus networks , 1995, SPAA '95.
[66] Ahmed Louri,et al. An Approximate Communication Framework for Network-on-Chips , 2020, IEEE Transactions on Parallel and Distributed Systems.
[67] Ahmed Louri,et al. An Energy-Efficient Network-on-Chip Design using Reinforcement Learning , 2019, 2019 56th ACM/IEEE Design Automation Conference (DAC).
[68] William J. Dally,et al. Deadlock-Free Adaptive Routing in Multicomputer Networks Using Virtual Channels , 1993, IEEE Trans. Parallel Distributed Syst.
[69] William J. Dally,et al. A delay model and speculative architecture for pipelined routers , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.
[70] José Duato,et al. A New Theory of Deadlock-Free Adaptive Routing in Wormhole Networks , 1993, IEEE Trans. Parallel Distributed Syst.
[71] John Kim,et al. Footprint: Regulating routing adaptiveness in Networks-on-Chip , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[72] William J. Dally,et al. Microarchitecture of a high radix router , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[73] Akif Ali,et al. Near-optimal worst-case throughput routing for two-dimensional mesh networks , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).
[74] Nan Jiang,et al. Packet chaining: Efficient single-cycle allocation for on-chip networks , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[75] Yoon Seok Yang,et al. SDPR: Improving Latency and Bandwidth in On-Chip Interconnect Through Simultaneous Dual-Path Routing , 2018, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[76] Xi Chen,et al. Up by their bootstraps: Online learning in Artificial Neural Networks for CMP uncore power management , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[77] Chita R. Das,et al. A heterogeneous multiple network-on-chip design: An application-aware approach , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).
[78] James C. Hoe,et al. Single-Chip Heterogeneous Computing: Does the Future Include Custom Logic, FPGAs, and GPGPUs? , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.
[79] Sally Floyd,et al. TCP and explicit congestion notification , 1994, CCRV.