On the energy efficiency and performance of irregular application executions on multicore, NUMA and manycore platforms

[3]  R. Madariaga.  Dynamics of an expanding circular fault , 1976, Bulletin of the Seismological Society of America.

[4]  Anil K. Jain,et al.  Algorithms for Clustering Data , 1988 .

[5]  Peter J. Rousseeuw,et al.  Finding Groups in Data: An Introduction to Cluster Analysis , 1990 .

[6]  G. Laporte.  The traveling salesman problem: An overview of exact and approximate algorithms , 1992 .

[7]  Hui Li,et al.  Locality and Loop Scheduling on NUMA Multiprocessors , 1993, 1993 International Conference on Parallel Processing - ICPP'93.

[8]  F. Collino.  Perfectly Matched Absorbing Layers for the Paraxial Equations , 1997 .

[9]  Inderjit S. Dhillon,et al.  A Data-Clustering Algorithm on Distributed Memory Multiprocessors , 1999, Large-Scale Parallel Data Mining.

[10]  Manish Gupta,et al.  Power-Aware Microarchitecture: Design and Modeling Challenges for Next-Generation Microprocessors , 2000, IEEE Micro.

[11]  Mohammed J. Zaki,et al.  Large-Scale Parallel Data Mining , 2002, Lecture Notes in Computer Science.

[12]  D.M. Mount,et al.  An Efficient k-Means Clustering Algorithm: Analysis and Implementation , 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[13]  Attila Gürsoy,et al.  Data Decomposition for Parallel K-means Clustering , 2003, Parallel Processing and Applied Mathematics.

[14]  Rui Xu,et al.  Survey of clustering algorithms , 2005, IEEE Transactions on Neural Networks.

[15]  Li Chen,et al.  Parallel simulation of strong ground motions during recent and historical damaging earthquakes in Tokyo, Japan , 2005, Parallel Comput.

[16]  P. Moczo,et al.  The finite-difference time-domain method for modeling of seismic wave propagation , 2007 .

[17]  Jean Roman,et al.  Exploiting Intensive Multithreading for the Efficient Simulation of 3D Seismic Wave Propagation , 2008, 2008 11th IEEE International Conference on Computational Science and Engineering.

[18]  Krista Rizman Zalik,et al.  An efficient k'-means clustering algorithm , 2008 .

[19]  Jean-François Méhaut,et al.  Parallel simulations of seismic wave propagation on NUMA architectures , 2009, PARCO.

[20]  James R. Larus,et al.  Spending Moore's dividend , 2009, CACM.

[21]  E.V. Prasad,et al.  A scalable k-means clustering algorithm on Multi-Core architecture , 2009, 2009 Proceeding of International Conference on Methods and Models in Computer Science (ICM2CS).

[22]  Samuel Williams,et al.  Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors , 2009, SIAM Rev.

[23]  Dhabaleswar K. Panda,et al.  Scalable Earthquake Simulation on Petascale Supercomputers , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.

[24]  Antti Ylä-Jääski,et al.  Energy- and Cost-Efficiency Analysis of ARM-Based Clusters , 2012, 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012).

[25]  Hermann Härtig,et al.  Measuring energy consumption for short code paths using RAPL , 2012, PERV.

[26]  P. O. A. Navaux,et al.  Time-to-Solution and Energy-to-Solution: A Comparison between ARM and Xeon , 2012, 2012 Third Workshop on Applications for Multi-Core Architecture.

[27]  Josep Torrellas,et al.  Comparing the power and performance of Intel's SCC to state-of-the-art CPUs and GPUs , 2012, 2012 IEEE International Symposium on Performance Analysis of Systems & Software.

[28]  Efraim Rotem,et al.  Power-Management Architecture of the Intel Microarchitecture Code-Named Sandy Bridge , 2012, IEEE Micro.

[29]  Simone Secchi,et al.  Efficient Sorting on the Tilera Manycore Architecture , 2012, 2012 IEEE 24th International Symposium on Computer Architecture and High Performance Computing.

[30]  Henrique C. Freitas,et al.  Parallel and distributed kmeans to identify the translation initiation site of proteins , 2012, 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC).

[31]  F. Dupros,et al.  Finite difference simulations of seismic wave propagation for understanding earthquake physics and predicting ground motions: Advances and challenges , 2013 .

[32]  Benoît Dupont de Dinechin,et al.  A Distributed Run-Time Environment for the Kalray MPPA®-256 Integrated Manycore Processor , 2013, ICCS.

[33]  Alex Ramírez,et al.  The low power architecture approach towards exascale computing , 2013, J. Comput. Sci.

[34]  Dirk Ribbrock,et al.  Energy efficiency vs. performance of the numerical solution of PDEs: An application study on a low-power ARM-based cluster , 2013, J. Comput. Phys.

[35]  Benoît Dupont de Dinechin,et al.  Extended Cyclostatic Dataflow Program Compilation and Execution for an Integrated Manycore Processor , 2013, ICCS.

[36]  Fabrice Dupros,et al.  On Scalability Issues of the Elastodynamics Equations on Multicore Platforms , 2013, ICCS.

[37]  Jean-François Méhaut,et al.  Analysis of computing and energy performance of multicore, NUMA, and manycore platforms for an irregular application , 2013, IA3 '13.

[38]  Jack Dongarra,et al.  Parallel Processing and Applied Mathematics , 2013, Lecture Notes in Computer Science.

[39]  Abdullah Gharaibeh,et al.  The energy case for graph processing on hybrid CPU and GPU systems , 2013, IA3 '13.

[40]  Philippe Thierry,et al.  Genetic Algorithm Based Auto-Tuning of Seismic Applications on Multi and Manycore Computers , 2014, HiPC 2014.

[41]  Wolfgang Karl,et al.  Evaluation of Adaptive Memory Management Techniques on the Tilera TILE-Gx Platform , 2014, ARCS Workshops.

[42]  Philippe Olivier Alexandre Navaux,et al.  Improving the Performance of Seismic Wave Simulations with Dynamic Load Balancing , 2014, 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.
