Selecting linear algebra kernel composition using response time prediction
[1] Ian H. Witten, et al. The WEKA data mining software: an update, 2009, SKDD.
[2] Erich Strohmaier, et al. A genetic algorithms approach to modeling the performance of memory-bound computations, 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).
[3] José Meseguer, et al. Order-Sorted Algebra I: Equational Deduction for Multiple Inheritance, Overloading, Exceptions and Partial Operations, 1992, Theor. Comput. Sci.
[4] Jesús Labarta, et al. A Framework for Performance Modeling and Prediction, 2002, ACM/IEEE SC 2002 Conference (SC'02).
[5] Lieven Eeckhout, et al. Performance prediction based on inherent program similarity, 2006, 2006 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[6] Laura Grigori, et al. Towards an accurate performance modeling of parallel sparse factorization, 2006, Applicable Algebra in Engineering, Communication and Computing.
[7] Paolo Bientinesi, et al. Modeling performance through memory-stalls, 2012, PERV.
[8] Marc Pantel, et al. Advanced service trading for scientific computing over the grid, 2009, The Journal of Supercomputing.
[9] Archana Ganapathi, et al. Predicting Multiple Metrics for Queries: Better Decisions Enabled by Machine Learning, 2009, 2009 IEEE 25th International Conference on Data Engineering.
[10] Joshua Alspector, et al. Data duplication: an imbalance problem?, 2003.
[11] Ronald L. Rivest, et al. Introduction to Algorithms, third edition, 2009.
[12] Jack J. Dongarra, et al. Automated empirical optimizations of software and the ATLAS project, 2001, Parallel Comput.
[13] Wayne Snyder, et al. Complete Sets of Transformations for General E-Unification, 1989, Theor. Comput. Sci.
[14] Luca Padovani, et al. HELM and the Semantic Math-Web, 2001, TPHOLs.
[15] Elmar Peise. Hierarchical Performance Modeling for Ranking Dense Linear Algebra Algorithms, 2012, ArXiv.
[16] James Demmel, et al. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology, 1997, ICS '97.
[17] Aurélie Hurault, et al. Intelligent Service Trading and Brokering for Distributed Network Services in GridSolve, 2010, VECPAR.
[18] J. Demmel, et al. Sun Microsystems, 1996.
[19] Rudolf Eigenmann, et al. Context-sensitive domain-independent algorithm composition and selection, 2006, PLDI '06.
[20] Jack Dongarra, et al. Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects, 2009.
[21] Chetan Gupta, et al. PQR: Predicting Query Execution Times for Autonomous Workload Management, 2008, 2008 International Conference on Autonomic Computing.
[22] Katherine Yelick, et al. OSKI: A library of automatically tuned sparse matrix kernels, 2005.
[23] Yuefan Deng, et al. New trends in high performance computing, 2001, Parallel Computing.
[24] RWTH Aachen, et al. Hierarchical Performance Modeling for Ranking Dense Linear Algebra Algorithms, 2012.
[25] James Demmel, et al. ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance, 1995, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[26] Robert A. van de Geijn, et al. The libflame Library for Dense Matrix Computations, 2009, Computing in Science & Engineering.
[27] Fabien L. Gandon, et al. A Machine Learning Approach to SPARQL Query Performance Prediction, 2014, 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT).
[28] Nancy M. Amato, et al. A framework for adaptive algorithm selection in STAPL, 2005, PPoPP.
[29] Sally A. McKee, et al. An Approach to Performance Prediction for Parallel Applications, 2005, Euro-Par.
[30] Sally A. McKee, et al. Machine learning based online performance prediction for runtime parallelization and task scheduling, 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[31] Michael R. Lowry, et al. Deductive Composition of Astronomical Software from Subroutine Libraries, 1994, CADE.
[32] Chun Chen, et al. Autotuning and Specialization: Speeding up Matrix Multiply for Small Matrices with Compiler Technology, 2010, Software Automatic Tuning, From Concepts to State-of-the-Art Results.
[33] P. Sadayappan, et al. Annotation-based empirical performance tuning using Orio, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[34] Paolo Bientinesi, et al. Performance Modeling for Dense Linear Algebra, 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.
[35] อนิรุธ สืบสิงห์, et al. Data Mining Practical Machine Learning Tools and Techniques, 2014.
[36] Jack J. Dongarra, et al. A Note on Auto-tuning GEMM for GPUs, 2009, ICCS.
[37] Ed Anderson, et al. LAPACK Users' Guide, 1995.
[38] Xin-She Yang, et al. Introduction to Algorithms, 2021, Nature-Inspired Optimization Algorithms.
[39] Ian H. Witten, et al. Data mining: practical machine learning tools and techniques, 3rd Edition, 1999.
[40] Jack Dongarra, et al. LAPACK Users' guide (third ed.), 1999.
[41] Shoaib Kamil, et al. OpenTuner: An extensible framework for program autotuning, 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[42] John M. Mellor-Crummey, et al. Cross-architecture performance predictions for scientific applications using parameterized models, 2004, SIGMETRICS '04/Performance '04.