Auto-tuning Spark big data workloads on POWER8: Prediction-based dynamic SMT threading
H. Peter Hofstee | Zhen Jia | Jianfeng Zhan | Lixin Zhang | Yonghua Lin | Chao Xue | Guancheng Chen
[1] Jian Li,et al. Dynamic power-performance adaptation of parallel computation on chip multiprocessors , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006.
[2] John D. McCalpin,et al. Characterization of simultaneous multithreading (SMT) efficiency in POWER5 , 2005, IBM J. Res. Dev..
[3] Wei-Yin Loh,et al. Classification and regression trees , 2011, WIREs Data Mining Knowl. Discov..
[4] A. Janiszewski,et al. Architectural support for enhanced SMT job scheduling , 2004, Proceedings. 13th International Conference on Parallel Architecture and Compilation Techniques, 2004. PACT 2004.
[5] Mihai Burcea,et al. An Adaptive OpenMP Loop Scheduler for Hyperthreaded SMPs , 2004, PDCS.
[6] John C. Platt. Using Analytic QP and Sparseness to Speed Training of Support Vector Machines , 1998, NIPS.
[7] Michael Gschwind,et al. IBM POWER8 processor core microarchitecture , 2015, IBM J. Res. Dev..
[8] Lingjia Tang,et al. SMiTe: Precise QoS Prediction on Real-System SMT Processors to Improve Utilization in Warehouse Scale Computers , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[9] Anastasia Ailamaki,et al. Clearing the clouds , 2012 .
[10] Balaram Sinharoy,et al. Advanced features in IBM POWER8 systems , 2015, IBM J. Res. Dev..
[11] Michael Voss,et al. Runtime empirical selection of loop schedulers on hyperthreaded SMPs , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.
[12] Dean M. Tullsen,et al. Symbiotic jobscheduling for a simultaneous multithreading processor , 2000, SIGP.
[13] Bronis R. de Supinski,et al. Prediction models for multi-dimensional power-performance optimization on many cores , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[14] Li Zhang,et al. SparkBench: a comprehensive benchmarking suite for in memory data analytic platform Spark , 2015, Conf. Computing Frontiers.
[15] Christoforos E. Kozyrakis,et al. Dynamic management of TurboMode in modern multi-core chips , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[16] J. Brian Gray,et al. Introduction to Linear Regression Analysis , 2002, Technometrics.
[17] Peter Harrington,et al. Machine Learning in Action , 2012 .
[18] J. Ross Quinlan,et al. Induction of Decision Trees , 1986, Machine Learning.
[19] Chunjie Luo,et al. Characterizing data analysis workloads in data centers , 2013, 2013 IEEE International Symposium on Workload Characterization (IISWC).
[20] Eduard Ayguadé,et al. Decomposable and responsive power models for multicore processors using performance counters , 2010, ICS '10.
[21] Jaejin Lee,et al. Adaptive execution techniques of parallel programs for multiprocessors , 2010, J. Parallel Distributed Comput..
[22] Ashutosh Kumar Singh,et al. The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2010 .
[23] Lieven Eeckhout,et al. Undersubscribed threading on clustered cache architectures , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[24] Jack L. Lo,et al. Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[25] Timothy Creech. Efficient multiprogramming for multicores with SCAF , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[26] Dimitrios S. Nikolopoulos,et al. Online power-performance adaptation of multithreaded programs using hardware event-based prediction , 2006, ICS '06.
[27] Pradip Bose,et al. Crank it up or dial it down: Coordinated multiprocessor frequency and folding control , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[28] Yoav Freund,et al. A Short Introduction to Boosting , 1999 .
[29] Pat Langley,et al. Estimating Continuous Distributions in Bayesian Classifiers , 1995, UAI.
[30] H. Peter Hofstee,et al. PATer: A Hardware Prefetching Automatic Tuner on IBM POWER8 Processor , 2016, IEEE Computer Architecture Letters.
[31] Dirk Grunwald,et al. Methods for modeling resource contention on simultaneous multithreading processors , 2005, 2005 International Conference on Computer Design.
[32] Alexandra Fedorova,et al. An SMT-Selection Metric to Improve Multithreaded Applications' Performance , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[33] Erich M. Nahum,et al. Evaluating the impact of simultaneous multithreading on network servers using real hardware , 2005, SIGMETRICS '05.
[34] Dimitrios S. Nikolopoulos,et al. Prediction-Based Power-Performance Adaptation of Multithreaded Scientific Codes , 2008, IEEE Transactions on Parallel and Distributed Systems.
[35] Yuqing Zhu,et al. BigDataBench: A big data benchmark suite from internet services , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[36] Donald Nguyen,et al. Machine learning-based prefetch optimization for data center applications , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.
[37] J. Morris Chang,et al. Performance Characterization of Java Applications on SMT Processors , 2005, IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005.
[38] Dean M. Tullsen,et al. Exploiting unbalanced thread scheduling for energy and performance on a CMP of SMT processors , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[39] Tom White,et al. Hadoop: The Definitive Guide , 2009 .
[40] Jaejin Lee,et al. Adaptive execution techniques for SMT multiprocessor architectures , 2005, PPOPP.
[41] Michael J. Franklin,et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing , 2012, NSDI.
[42] D. Kibler,et al. Instance-based learning algorithms , 2004, Machine Learning.
[43] Lieven Eeckhout,et al. Automatic SMT threading for OpenMP applications on the Intel Xeon Phi co-processor , 2014, ROSS@ICS.
[44] Babak Falsafi,et al. Clearing the clouds: a study of emerging scale-out workloads on modern hardware , 2012, ASPLOS XVII.
[45] Douglas C. Montgomery,et al. Introduction to Linear Regression Analysis, Solutions Manual (Wiley Series in Probability and Statistics) , 2007 .
[46] Stijn Eyerman,et al. Probabilistic job symbiosis modeling for SMT processor scheduling , 2010, ASPLOS XV.
[47] Chih-Jen Lin,et al. A comparison of methods for multiclass support vector machines , 2002, IEEE Trans. Neural Networks.