Parallel Pattern Detection for Architectural Improvements

As general-purpose computing shifts toward increasingly parallel architectures, clever architectures are needed to achieve high parallelism on previously sequential or poorly parallelized code. To fully utilize the many-core systems of the present and future, architecture design philosophy must shift toward understanding how the parallel programming process affects design decisions. Parallel patterns provide a way to create parallel code for a wide variety of algorithms. Additionally, they provide a convenient classification mechanism: the patterns are understandable to programmers, and code within a pattern exhibits similar behaviors that can be architecturally exploited. In this work, we explore the capabilities of pattern-driven dynamic architectures as well as detection mechanisms useful for dynamic and static parallel pattern recognition.
