A survey of power and energy efficient techniques for high performance numerical linear algebra operations
Zizhong Chen | Li Tan | Omar Hussaini | Longxiang Chen | Shashank Kothapalli | Ryan Bissiri
[1] Jaeyoung Choi,et al. Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines , 1994, Sci. Program..
[2] Mahadev Satyanarayanan,et al. PowerScope: a tool for profiling the energy usage of mobile applications , 1999, Proceedings WMCSA'99. Second IEEE Workshop on Mobile Computing Systems and Applications.
[3] Frank Bellosa,et al. The benefits of event-driven energy accounting in power-sensitive systems , 2000, ACM SIGOPS European Workshop.
[4] Viktor K. Prasanna,et al. Energy-Efficient Matrix Multiplication on FPGAs , 2002, FPL.
[5] David F. Heidel,et al. An Overview of the BlueGene/L Supercomputer , 2002, ACM/IEEE SC 2002 Conference (SC'02).
[6] Ragunathan Rajkumar,et al. Critical power slope: understanding the runtime effects of frequency scaling , 2002, ICS '02.
[7] Naehyuck Chang,et al. Energy-Monitoring Tool for Low-Power Embedded Programs , 2002, IEEE Des. Test Comput..
[8] Dragan Maksimovic,et al. Closed-loop adaptive voltage scaling controller for standard-cell ASICs , 2002, ISLPED '02.
[9] Viktor K. Prasanna,et al. Energy efficiency of FPGAs and programmable processors for matrix multiplication , 2002, 2002 IEEE International Conference on Field-Programmable Technology, 2002. (FPT). Proceedings..
[10] M. Martonosi,et al. Runtime power monitoring in high-end processors: methodology and empirical data , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..
[11] Viktor K. Prasanna,et al. Time and Energy Efficient Matrix Factorization Using FPGAs , 2003, FPL.
[12] Naehyuck Chang,et al. Memory-aware energy-optimal frequency assignment for dynamic supply voltage scaling , 2004, Proceedings of the 2004 International Symposium on Low Power Electronics and Design (IEEE Cat. No.04TH8758).
[13] Viktor K. Prasanna,et al. A high-performance and energy-efficient architecture for floating-point based LU decomposition on FPGAs , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings..
[14] Viktor K. Prasanna,et al. Efficient Floating-point Based Block LU Decomposition on FPGAs , 2004, ERSA.
[15] Kevin Skadron,et al. Understanding the energy efficiency of simultaneous multithreading , 2004, Proceedings of the 2004 International Symposium on Low Power Electronics and Design (IEEE Cat. No.04TH8758).
[16] Viktor K. Prasanna,et al. Energy- and time-efficient matrix multiplication on FPGAs , 2005, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[17] Wu-chun Feng,et al. A Power-Aware Run-Time System for High-Performance Computing , 2005, ACM/IEEE SC 2005 Conference (SC'05).
[18] Rong Ge,et al. Power and energy profiling of scientific applications on distributed systems , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[19] Xin Yuan,et al. Automatic generation and tuning of MPI collective communication routines , 2005, ICS '05.
[20] Rong Ge,et al. Performance-constrained Distributed DVS Scheduling for Scientific Applications on Power-aware Clusters , 2005, ACM/IEEE SC 2005 Conference (SC'05).
[21] Mahmut T. Kandemir,et al. Reducing power with performance constraints for parallel sparse applications , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[22] Mahmut T. Kandemir,et al. Reducing energy consumption of parallel sparse matrix applications through integrated link/CPU voltage scaling , 2007, The Journal of Supercomputing.
[23] Wolf-Dietrich Weber,et al. Power provisioning for a warehouse-sized computer , 2007, ISCA '07.
[24] Manoj Sachdev,et al. Variation-Aware Adaptive Voltage Scaling System , 2007, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[25] Rajkumar Buyya,et al. Power Aware Scheduling of Bag-of-Tasks Applications with Deadline Constraints on DVS-enabled Clusters , 2007, Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07).
[26] William J. Kaiser,et al. The Energy Endoscope: Real-Time Detailed Energy Accounting for Wireless Sensor Nodes , 2007, 2008 International Conference on Information Processing in Sensor Networks (ipsn 2008).
[27] Boyana Norris,et al. A component infrastructure for performance and power modeling of parallel scientific applications , 2008, CBHPC '08.
[28] Steven Swanson,et al. Gordon: using flash memory to build fast, power-efficient clusters for data-intensive applications , 2009, ASPLOS.
[29] Amar Phanishayee,et al. FAWN: a fast array of wimpy nodes , 2009, SOSP '09.
[30] Lea,et al. The Linux Energy Attribution and Accounting Platform , 2009 .
[31] Zhengfan Xia,et al. Architecture of a low-power FPGA based on self-adaptive voltage control , 2009, 2009 International SoC Design Conference (ISOCC).
[32] Hyesoon Kim,et al. An integrated GPU power and performance model , 2010, ISCA.
[33] Dong Li,et al. PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications , 2010, IEEE Transactions on Parallel and Distributed Systems.
[34] Wu-chun Feng,et al. Statistical Power and Performance Modeling for Optimizing the Energy Efficiency of Scientific Computing , 2010, 2010 IEEE/ACM Int'l Conference on Green Computing and Communications & Int'l Conference on Cyber, Physical and Social Computing.
[35] Jack Dongarra,et al. Distributed Dense Numerical Linear Algebra Algorithms on Massively Parallel Architectures: DPLASMA , 2011 .
[36] Ulrich Meyer,et al. Energy-efficient sorting using solid state disks , 2010, International Conference on Green Computing.
[37] Wayne Luk,et al. Dynamic scheduling Monte-Carlo framework for multi-accelerator heterogeneous clusters , 2010, 2010 International Conference on Field-Programmable Technology.
[38] Alexander S. Szalay,et al. Low-power Amdahl-balanced blades for data intensive computing , 2010, OPSR.
[39] Qian Zhu,et al. Power-Aware Consolidation of Scientific Workflows in Virtualized Environments , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[40] Rahul Khanna,et al. RAPL: Memory power estimation and capping , 2010, 2010 ACM/IEEE International Symposium on Low-Power Electronics and Design (ISLPED).
[41] R. C. Whaley,et al. Automatically Tuned Linear Algebra Software (ATLAS) , 2011, Encyclopedia of Parallel Computing.
[42] Enrique S. Quintana-Ortí,et al. Improving power efficiency of dense linear algebra algorithms on multi-core processors via slack control , 2011, 2011 International Conference on High Performance Computing & Simulation.
[43] Robert A. van de Geijn,et al. A high-performance, low-power linear algebra core , 2011, ASAP 2011 - 22nd IEEE International Conference on Application-specific Systems, Architectures and Processors.
[44] Shuaiwen Song,et al. An ISO-Energy-Efficient Approach to Scalable System Power-Performance Optimization , 2011, 2011 IEEE International Conference on Cluster Computing.
[45] Enrique S. Quintana-Ortí,et al. DVFS-control techniques for dense linear algebra operations on multi-core processors , 2012, Computer Science - Research and Development.
[46] Chris Fallin,et al. Memory power management via dynamic voltage/frequency scaling , 2011, ICAC '11.
[47] Daniel Hackenberg,et al. Simultaneous multithreading on x86_64 systems: an energy efficiency evaluation , 2011, HotPower '11.
[48] Qingyuan Deng,et al. MemScale: active low-power modes for main memory , 2011, ASPLOS XVI.
[49] Enrique S. Quintana-Ortí,et al. Optimization of power consumption in the iterative solution of sparse linear systems on graphics processors , 2011, Computer Science - Research and Development.
[50] Vincent Heuveline,et al. Analysis and optimization of power consumption in the iterative solution of sparse linear systems on multi-core and many-core platforms , 2011, 2011 International Green Computing Conference and Workshops.
[51] Shuaiwen Song,et al. Iso-Energy-Efficiency: An Approach to Power-Constrained Parallel Computation , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[52] Jack J. Dongarra,et al. Profiling high performance dense linear algebra algorithms on multicore architectures for power and energy efficiency , 2012, Computer Science - Research and Development.
[53] Robert A. van de Geijn,et al. Codesign Tradeoffs for High-Performance, Low-Power Linear Algebra Architectures , 2012, IEEE Transactions on Computers.
[54] Jian Li,et al. Power-efficient time-sensitive mapping in heterogeneous systems , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).
[55] Rafael Mayo,et al. Binding Performance and Power of Dense Linear Algebra Operations , 2012, 2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications.
[56] Rafael Mayo,et al. Leveraging Task-Parallelism in Energy-Efficient ILU Preconditioners , 2012, ICT-GLOW.
[57] Enrique S. Quintana-Ortí,et al. Saving Energy in the LU Factorization with Partial Pivoting on Multi-core Processors , 2012, 2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing.
[58] Ananta Tiwari,et al. Modeling Power and Energy Usage of HPC Kernels , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum.
[59] J. Demmel,et al. Instrumenting Linear Algebra Energy Consumption via On-chip Energy Counters , 2012 .
[60] Wayne Luk,et al. Heterogeneous Systems for Energy Efficient Scientific Computing , 2012, ARC.
[61] George Bosilca,et al. Power profiling of Cholesky and QR factorizations on distributed memory systems , 2012, Computer Science - Research and Development.
[62] Jack J. Dongarra,et al. Energy Footprint of Advanced Dense Numerical Linear Algebra Using Tile Algorithms on Multicore Architectures , 2012, 2012 Second International Conference on Cloud and Green Computing.
[63] Shirley Moore,et al. Measuring Energy and Power with PAPI , 2012, 2012 41st International Conference on Parallel Processing Workshops.
[64] Enrique S. Quintana-Ortí,et al. Reducing Energy Consumption of Dense Linear Algebra Operations on Hybrid CPU-GPU Platforms , 2012, 2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications.
[65] Rafael Mayo,et al. Analysis of Strategies to Save Energy for Message-Passing Dense Linear Algebra Kernels , 2012, 2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing.
[66] Enrique S. Quintana-Ortí,et al. On the Impact of Optimization on the Time-Power-Energy Balance of Dense Linear Algebra Factorizations , 2013, ICA3PP.
[67] Dong Li,et al. Improving performance and energy efficiency of matrix multiplication via pipeline broadcast , 2013, 2013 IEEE International Conference on Cluster Computing (CLUSTER).
[68] Gokcen Kestor,et al. Enabling accurate power profiling of HPC applications on exascale systems , 2013, ROSS '13.
[69] Guang R. Gao,et al. Optimizing the LU Factorization for Energy Efficiency on a Many-Core Architecture , 2013, LCPC.
[70] Enrique S. Quintana-Ortí,et al. Trading Off Performance for Power-Energy in Dense Linear Algebra Operations , 2013, HiPC 2013.
[71] Dong Li,et al. A2E: Adaptively aggressive energy efficient DVFS scheduling for data intensive applications , 2013, 2013 IEEE 32nd International Performance Computing and Communications Conference (IPCCC).
[72] Rong Ge,et al. Effects of Dynamic Voltage and Frequency Scaling on a K20 GPU , 2013, 2013 42nd International Conference on Parallel Processing.
[73] Jose Nunez-Yanez. Energy proportional computing in commercial FPGAs with adaptive voltage scaling , 2013 .
[74] Domingo Giménez,et al. Analytical Modeling of the Energy Consumption for the High Performance Linpack , 2013, 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.
[75] Optimizing Energy Efficiency for Distributed Dense Matrix Factorizations via Utilizing Algorithmic Characteristics , 2014 .
[76] Dong Li,et al. HP-DAEMON: High Performance Distributed Adaptive Energy-efficient Matrix-multiplicatiON , 2014, ICCS.
[77] José Luis Núñez-Yáñez,et al. Adaptive Voltage Scaling with In-Situ Detectors in Commercial FPGAs , 2015, IEEE Transactions on Computers.