The Multi-Core Era - Trends and Challenges

Since the very beginning of hardware development, computer processors have been designed with ever-increasing clock frequencies and sophisticated built-in optimization strategies. Due to physical limitations, this 'free lunch' of speedup has come to an end. This article gives a summary and bibliography of recent trends and challenges in chip multiprocessor (CMP) architectures. It discusses how 40 years of parallel computing research needs to be considered in the upcoming multi-core era. We argue that future research must be driven from two sides: a better expression of hardware structures, and a domain-specific understanding of software parallelism.
