Energy Efficient Computing on Multi-core Processors: Vectorization and Compression Techniques
[1] Yoonho Park,et al. Data access optimization in a processing-in-memory system , 2015, Conf. Computing Frontiers.
[2] Lasse Natvig,et al. Performance and Energy Efficiency Analysis of Data Reuse Transformation Methodology on Multicore Processor , 2012, Euro-Par Workshops.
[3] Katrin Baumgartner. Custom Memory Management Methodology Exploration Of Memory Organisation For Embedded Multimedia System Design , 2016 .
[4] Gihan R. Mudalige,et al. Vectorizing Unstructured Mesh Computations for Many-core Architectures , 2014, PMAM.
[5] Peter Pirsch,et al. Array architectures for block matching algorithms , 1989 .
[6] Laxmikant V. Kalé,et al. Optimizing Data Locality for Fork/Join Programs Using Constrained Work Stealing , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[7] Hugo De Man,et al. Formalized methodology for data reuse: exploration for low-power hierarchical memory mappings , 1998, IEEE Trans. Very Large Scale Integr. Syst.
[8] Pradeep Dubey,et al. Closing the Ninja Performance Gap through Traditional Programming and Compiler Technology , 2012 .
[9] Mario Badr,et al. Load Value Approximation , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[10] S. Nikolaidis,et al. The Effect of Data-Reuse Transformations on Multimedia Applications for Application Specific Processors , 2005, 2005 IEEE Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications.
[11] Mahmut T. Kandemir,et al. Studying inter-core data reuse in multicores , 2011, SIGMETRICS '11.
[12] Geeta Sikka,et al. A Study on Vectorization Methods for Multicore SIMD Architecture Provided by Compilers , 2014 .
[13] Jack J. Dongarra,et al. A Step towards Energy Efficient Computing: Redesigning a Hydrodynamic Application on CPU-GPU , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[14] Krste Asanovic,et al. Vector Processors for Energy-Efficient Embedded Systems , 2016, MES@ISCA.
[15] Margaret H. Wright,et al. The opportunities and challenges of exascale computing , 2010 .
[16] Avinash Sodani,et al. Intel Xeon Phi Processor High Performance Programming: Knights Landing Edition 2nd Edition , 2016 .
[17] Luca Benini,et al. Integrated task scheduling and data assignment for SDRAMs in dynamic applications , 2004, IEEE Design & Test of Computers.
[18] Guang R. Gao,et al. Optimizing the Fast Fourier Transform on a Multi-core Architecture , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.
[19] Alexander Chatzigeorgiou,et al. Evaluating the Effect of Data-Reuse Transformations on Processor Power Consumption , 2001 .
[20] Erik Brockmeyer,et al. Data Access and Storage Management for Embedded Programmable Processors , 2002, Springer US.
[21] Magnus Jahre,et al. Optimized hardware for suboptimal software: The case for SIMD-aware benchmarks , 2014, 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[22] Constantinos E. Goutis,et al. Data-Reuse Exploration for Low-Power Realization of Multimedia Applications on Embedded Cores , 1999 .
[23] Yunsong Li,et al. High-Throughput Power-Efficient VLSI Architecture of Fractional Motion Estimation for Ultra-HD HEVC Video Encoding , 2015, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[24] Christoforos E. Kozyrakis,et al. Models and Metrics to Enable Energy-Efficiency Optimizations , 2007, Computer.
[25] Jörg Ott,et al. RTP Payload Format for ITU-T Rec. H.263 Video , 2007, RFC.
[26] Mats Brorsson,et al. A Comparison of some recent Task-based Parallel Programming Models , 2010 .
[27] Borko Furht,et al. Parallel programming for multimedia applications , 2010, Multimedia Tools and Applications.
[28] Lasse Natvig,et al. Case Studies of Multi-core Energy Efficiency in Task Based Programs , 2012, ICT-GLOW.
[29] Magnus Jahre,et al. ParVec: vectorizing the PARSEC benchmark suite , 2015, Computing.