Language and compiler support for auto-tuning variable-accuracy algorithms
[1] James Demmel,et al. Applied Numerical Linear Algebra , 1997 .
[2] Henry Hoffmann,et al. Dynamic knobs for responsive power-aware computing , 2011, ASPLOS XVI.
[3] I-Hsin Chung,et al. Active Harmony: Towards Automated Performance Tuning , 2002, ACM/IEEE SC 2002 Conference (SC'02).
[4] Samuel Williams,et al. Roofline: An Insightful Visual Performance Model for Floating-Point Programs and Multicore Architectures , 2008 .
[5] S. Lennart Johnsson,et al. Scheduling FFT computation on SMP and multicore systems , 2007, ICS '07.
[6] Christoph W. Kessler,et al. A Framework for Performance-Aware Composition of Explicitly Parallel Components , 2007, PARCO.
[7] Samuel Williams,et al. Roofline: an insightful visual performance model for multicore architectures , 2009, CACM.
[8] Martin C. Rinard. Probabilistic accuracy bounds for fault-tolerant computations that discard tasks , 2006, ICS '06.
[9] James Demmel,et al. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology , 1997, ICS '97.
[10] Martin C. Rinard. Using early phase termination to eliminate load imbalances at barrier synchronization points , 2007, OOPSLA.
[11] Eric A. Brewer,et al. High-level optimization via automated statistical modeling , 1995, PPOPP '95.
[12] Woongki Baek,et al. Green: A System for Supporting Energy-Conscious Programming using Principled Approximation , 2009 .
[13] David S. Johnson,et al. A 71/60 theorem for bin packing , 1985, J. Complex..
[14] Woongki Baek,et al. Green: a framework for supporting energy-conscious programming using controlled approximation , 2010, PLDI '10.
[15] Katherine A. Yelick,et al. Optimizing Sparse Matrix Computations for Register Reuse in SPARSITY , 2001, International Conference on Computational Science.
[16] Alan Edelman,et al. PetaBricks: a language and compiler for algorithmic choice , 2009, PLDI '09.
[17] Steven G. Johnson,et al. The Design and Implementation of FFTW3 , 2005, Proceedings of the IEEE.
[18] Franz Franchetti,et al. Operator Language: A Program Generation Framework for Fast Kernels , 2009, DSL.
[19] Chun Chen,et al. A scalable auto-tuning framework for compiler optimization , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[20] William L. Briggs,et al. A multigrid tutorial , 1987 .
[21] Katherine Yelick,et al. OSKI: A library of automatically tuned sparse matrix kernels , 2005 .
[22] William L. Briggs,et al. A multigrid tutorial, Second Edition , 2000 .
[23] Thomas E. Hull,et al. Specifications for a variable-precision arithmetic coprocessor , 1991, [1991] Proceedings 10th IEEE Symposium on Computer Arithmetic.
[24] H. Yu,et al. An adaptive algorithm selection framework , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004..
[25] Jack Dongarra,et al. Special Issue on Program Generation, Optimization, and Platform Adaptation , 2005, Proc. IEEE.
[26] Franz Franchetti,et al. SPIRAL: Code Generation for DSP Transforms , 2005, Proceedings of the IEEE.
[27] Jesper Andersson,et al. Profile-Guided Composition , 2008, SC@ETAPS.
[28] Martin Rinard,et al. Power-Aware Computing with Dynamic Knobs , 2010 .
[29] Michail G. Lagoudakis,et al. Algorithm Selection using Reinforcement Learning , 2000, ICML.
[30] Sergei Vassilvitskii,et al. k-means++: the advantages of careful seeding , 2007, SODA '07.
[31] Wenceslas Fernandez de la Vega,et al. Bin packing can be solved within 1+epsilon in linear time , 1981, Comb..
[32] Louis A. Hageman,et al. Iterative Solution of Large Linear Systems , 1971 .
[33] Edward P. Markowski,et al. Conditions for the Effectiveness of a Preliminary Test of Variance , 1990 .
[34] Jack J. Dongarra,et al. Automatically Tuned Linear Algebra Software , 1998, Proceedings of the IEEE/ACM SC98 Conference.
[35] Henry Hoffmann,et al. Quality of service profiling , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.
[36] Tor A. Ramstad,et al. Hybrid KLT-SVD image compression , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[37] G. S. Lueker,et al. Bin packing can be solved within 1 + ε in linear time , 1981 .
[38] Markus Püschel,et al. Computer Generation of General Size Linear Transform Libraries , 2009, 2009 International Symposium on Code Generation and Optimization.
[39] Lotfi A. Zadeh,et al. Fuzzy logic, neural networks, and soft computing , 1993, CACM.
[40] Alan Edelman,et al. Autotuning multigrid with PetaBricks , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[41] José M. F. Moura,et al. Spiral: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms , 2004, Int. J. High Perform. Comput. Appl..
[42] Martin Rinard,et al. Using Code Perforation to Improve Performance, Reduce Energy Consumption, and Respond to Failures , 2009 .