A methodology correlating code optimizations with data memory accesses, execution time and energy consumption
[1] Henk Corporaal,et al. Layer assignment techniques for low energy in multi-layered memory organisations , 2003 .
[2] Erik Brockmeyer,et al. Layer assignment techniques for low energy in multi-layered memory organisations , 2003, 2003 Design, Automation and Test in Europe Conference and Exhibition.
[3] Sanjay V. Rajopadhye,et al. Multi-level tiling: M for the price of one , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[4] Francky Catthoor,et al. Array Interleaving—An Energy-Efficient Data Layout Transformation , 2015, TODE.
[5] Nikolaos Voros,et al. Combining Software Cache Partitioning and Loop Tiling for Effective Shared Cache Management , 2018 .
[6] Uday Bondhugula,et al. A practical automatic polyhedral parallelizer and locality optimizer , 2008, PLDI '08.
[7] Mary W. Hall,et al. CHiLL : A Framework for Composing High-Level Loop Transformations , 2007 .
[8] Gianluca Palermo,et al. Predictive modeling methodology for compiler phase-ordering , 2016, PARMA-DITAM '16.
[9] Gianluca Palermo,et al. MiCOMP: Mitigating the Compiler Phase-Ordering Problem Using Optimization Sub-Sequences and Machine Learning , 2017, TACO.
[10] Francky Catthoor,et al. Incremental hierarchical memory size estimation for steering of loop transformations , 2007, TODE.
[11] Nikos S. Voros,et al. Combining Software Cache Partitioning and Loop Tiling for Effective Shared Cache Management , 2018, ACM Trans. Embed. Comput. Syst..
[12] Chen Ding,et al. Defensive loop tiling for shared cache , 2013, Proceedings of the 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[13] Gary S. Tyson,et al. Practical exhaustive optimization phase order exploration and evaluation , 2009, TACO.
[14] Douglas L. Jones,et al. Fast searches for effective optimization phase sequences , 2004, PLDI '04.
[15] Mark Stephenson,et al. Predicting unroll factors using supervised classification , 2005, International Symposium on Code Generation and Optimization.
[16] Gianluca Palermo,et al. COBAYN: Compiler Autotuning Framework Using Bayesian Networks , 2016, ACM Trans. Archit. Code Optim..
[17] João M. P. Cardoso,et al. Use of Previously Acquired Positioning of Optimizations for Phase Ordering Exploration , 2015, SCOPES.
[18] Sameer Kulkarni,et al. An evaluation of different modeling techniques for iterative compilation , 2011, 2011 Proceedings of the 14th International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES).
[19] Michael F. P. O'Boyle,et al. The effect of cache models on iterative compilation for combined tiling and unrolling , 2004, Concurr. Comput. Pract. Exp..
[20] Pavlos Petoumenos,et al. Minimizing the cost of iterative compilation with active learning , 2017, 2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[21] Peter M. W. Knijnenburg,et al. Automatic selection of compiler options using non-parametric inferential statistics , 2005, 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05).
[22] Suresh Purini,et al. Finding good optimization sequences covering program space , 2013, TACO.
[23] Michael F. P. O'Boyle,et al. Rapidly Selecting Good Compiler Optimizations using Performance Counters , 2007, International Symposium on Code Generation and Optimization (CGO'07).
[24] João M. P. Cardoso,et al. Compiler Phase Ordering as an Orthogonal Approach for Reducing Energy Consumption , 2018, ArXiv.
[25] Christian Lengauer,et al. Polly - Performing Polyhedral Optimizations on a Low-Level Intermediate Representation , 2012, Parallel Process. Lett..
[26] Wen-mei W. Hwu,et al. Data Layout Transformation Exploiting Memory-Level Parallelism in Structured Grid Many-Core Applications , 2010, International Journal of Parallel Programming.
[27] Sameer Kulkarni,et al. Mitigating the compiler optimization phase-ordering problem using machine learning , 2012, OOPSLA '12.
[28] Michael F. P. O'Boyle,et al. Evaluating Iterative Compilation , 2002, LCPC.
[29] Gianluca Palermo,et al. A Survey on Compiler Autotuning using Machine Learning , 2018, ACM Comput. Surv..
[30] Ghassan Shobaki,et al. Preallocation instruction scheduling with register pressure minimization using a combinatorial optimization approach , 2013, ACM Trans. Archit. Code Optim..
[31] Francky Catthoor,et al. Survey of Low-Energy Techniques for Instruction Memory Organisations in Embedded Systems , 2012, Journal of Signal Processing Systems.
[32] Meikang Qiu,et al. Energy minimization with loop fusion and multi-functional-unit scheduling for multidimensional DSP , 2008, J. Parallel Distributed Comput..
[33] Albert Cohen,et al. Polyhedral-Model Guided Loop-Nest Auto-Vectorization , 2009, 2009 18th International Conference on Parallel Architectures and Compilation Techniques.
[34] Dawei Wang,et al. APC: A Novel Memory Metric and Measurement Methodology for Modern Memory Systems , 2014, IEEE Transactions on Computers.
[35] Markus Püschel,et al. Offline library adaptation using automatically generated heuristics , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[36] Jung Ho Ahn,et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[37] Mahmut T. Kandemir,et al. Optimizing shared cache behavior of chip multiprocessors , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[38] Toshio Endo,et al. An Autotuning Framework for Scalable Execution of Tiled Code via Iterative Polyhedral Compilation , 2019, ACM Trans. Archit. Code Optim..
[39] Michele Tartara,et al. Parallel iterative compilation: using MapReduce to speedup machine learning in compilers , 2012, MapReduce '12.
[40] Albert Cohen,et al. Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time , 2007, International Symposium on Code Generation and Optimization (CGO'07).
[41] Prasanna Balaprakash,et al. An Experimental Study of Global and Local Search Algorithms in Empirical Performance Tuning , 2012, VECPAR.
[42] Michael F. P. O'Boyle,et al. Automatic feature generation for machine learning-based optimising compilation , 2014, ACM Trans. Archit. Code Optim..
[43] Albert Cohen,et al. Iterative optimization in the polyhedral model: part ii, multidimensional time , 2008, PLDI '08.
[44] Sally A. McKee,et al. ROSE::FTTransform - A source-to-source translation framework for exascale fault-tolerance research , 2012, IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN 2012).
[45] Kedar S. Namjoshi,et al. Loopy: Programmable and Formally Verified Loop Transformations , 2016, SAS.
[46] Somayeh Sardashti,et al. The gem5 simulator , 2011, CARN.
[47] Mahmut T. Kandemir,et al. On-chip cache hierarchy-aware tile scheduling for multicore machines , 2011, International Symposium on Code Generation and Optimization (CGO 2011).
[48] Michael F. P. O'Boyle,et al. Automatic Feature Generation for Machine Learning Based Optimizing Compilation , 2009, 2009 International Symposium on Code Generation and Optimization.
[49] Richard W. Vuduc,et al. POET: Parameterized Optimizations for Empirical Tuning , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[50] Uday Bondhugula,et al. Loop transformations: convexity, pruning and optimization , 2011, POPL '11.
[51] Henk Corporaal,et al. Trade-offs in loop transformations , 2009, TODE.
[52] Sanjay V. Rajopadhye,et al. Parameterized tiled loops for free , 2007, PLDI '07.
[53] Stefano Crespi-Reghizzi,et al. Continuous learning of compiler heuristics , 2013, TACO.
[54] Xing Zhou,et al. Hierarchical overlapped tiling , 2012, CGO '12.
[55] Dawei Wang,et al. Concurrent Average Memory Access Time , 2014, Computer.
[56] Olaf Krzikalla,et al. Scout: A Source-to-Source Transformator for SIMD-Optimizations , 2011, Euro-Par Workshops.
[57] P. Sadayappan,et al. Annotation-based empirical performance tuning using Orio , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[58] Uday Bondhugula,et al. PLuTo: A Practical and Fully Automatic Polyhedral Program Optimization System , 2015 .