Correctly implementing value prediction in microprocessors that support multithreading or multiprocessing

This paper explores how value prediction interacts with thread-level parallelism techniques, including multithreading and multiprocessing, where correctness is defined by a memory consistency model. Value prediction interacts subtly with the memory consistency model because it allows data-dependent instructions to be reordered. We find that predicting a value and later verifying that the computed value matches the prediction is not always sufficient. We present an example of a multithreaded pointer manipulation that can generate a surprising and erroneous result when value prediction is implemented without regard to the memory consistency model. We show that this problem can occur with real software, and we discuss how to apply existing techniques to eliminate the problem in both sequentially consistent systems and systems that obey relaxed memory consistency models.
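The kind of failure described above can be illustrated with a small deterministic simulation. This is only a sketch under stated assumptions, not the example from the paper itself: the node names `A` and `B`, the `head` pointer, the dictionary-based memory, and all function names are hypothetical. A writer initializes a node and then publishes it through `head`; a reader predicts the value of `head`, performs the dependent load early, and only afterwards executes and verifies the pointer load.

```python
# Hypothetical shared memory: "head" points at node A; node B is unused
# and still holds a stale data value of 0.
INITIAL_MEMORY = {"head": "A", "A.data": 1, "B.data": 0}


def writer(mem):
    """Writer thread: initialize node B, then publish it (in that order)."""
    mem["B.data"] = 42   # W1: initialize the new node
    mem["head"] = "B"    # W2: publish the node, ordered after W1


def reader_with_value_prediction(mem, predicted_head, other_thread):
    """Reader thread whose load of 'head' is value-predicted.

    The dependent load runs first using the predicted pointer; the real
    pointer load and the verification of the prediction happen later.
    """
    # Dependent load executes early, using the predicted pointer value.
    data = mem[predicted_head + ".data"]
    # The other thread's stores become visible in between (assumed timing).
    other_thread(mem)
    # The pointer load finally executes and the prediction is checked.
    actual_head = mem["head"]
    verified = actual_head == predicted_head
    return actual_head, data, verified


mem = dict(INITIAL_MEMORY)
head, data, verified = reader_with_value_prediction(mem, "B", writer)
print(head, data, verified)  # prints: B 0 True
```

The prediction verifies, yet the reader observes the published pointer `B` together with the stale value `0`. Because the writer stores `B.data` before `head`, no sequentially consistent interleaving lets a reader see `head == "B"` and then `B.data == 0`, so checking the predicted value alone does not catch the violation in this sketch.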
