Using Simple Abstraction to Guide the Reinvention of Computing for Parallelism

The sudden shift from single-processor computer systems to many-processor parallel computing systems requires reinventing much of Computer Science (CS): how to actually build and program the new parallel systems. CS urgently requires convergence to a robust parallel general-purpose platform that provides good performance and is easy to program. Unfortunately, this same objective has eluded decades of parallel computing research. Now, continued delays and uncertainty could start affecting important sectors of the economy. This paper advocates a minimalist stepping-stone: settle first on a simple abstraction that encapsulates the new interface between programmers on the one hand and system builders on the other. This paper also makes two concrete suggestions: (i) the Immediate Concurrent Execution (ICE) abstraction as a candidate for the new abstraction, and (ii) the Explicit Multi-Threaded (XMT) general-purpose parallel platform, under development at the University of Maryland, as a possible embodiment of ICE. ICE and XMT build on a formidable body of knowledge, known as PRAM (for parallel random-access machine, or model) algorithmics, and on a latent, though not widespread, familiarity with it. Ease of programming, strong speedups, and other attractive properties of the approach suggest that we may be much better prepared for the challenges ahead than many realize.
