Optimizing NANOS OpenMP for the IBM Cyclops multithreaded architecture
Eduard Ayguadé | José E. Moreira | Jesús Labarta | Xavier Martorell | George Almási | José G. Castaños | Calin Cascaval | David Ródenas
[1] Susan J. Eggers, et al. The effectiveness of multiple hardware contexts, 1994, ASPLOS VI.
[2] Dean M. Tullsen, et al. Simultaneous multithreading: a platform for next-generation processors, 1997, IEEE Micro.
[3] Mitsuhisa Sato, et al. Design of OpenMP Compiler for an SMP Cluster, 1999.
[4] Ajay K. Royyuru, et al. Blue Gene: A vision for protein science using a petaflop supercomputer, 2001, IBM Syst. J.
[5] Eduard Ayguadé, et al. NanosCompiler: supporting flexible multilevel parallelism exploitation in OpenMP, 2000.
[6] Nader Bagherzadeh, et al. Performance study of a multithreaded superscalar microprocessor, 1996, Proceedings. Second International Symposium on High-Performance Computer Architecture.
[7] Willy Zwaenepoel, et al. OpenMP on Networks of Workstations, 1998, Proceedings of the IEEE/ACM SC98 Conference.
[8] Marco Zagha, et al. Origin™ 2000 and Onyx2® Performance Tuning and Optimization Guide, 1993.
[9] Mario Nemirovsky, et al. Increasing superscalar performance through multistreaming, 1995, PACT.
[10] David H. Bailey, et al. The NAS Parallel Benchmarks, 1991, Int. J. High Perform. Comput. Appl.
[11] Eduard Ayguadé, et al. Evaluation of OpenMP for the Cyclops Multithreaded Architecture, 2003, WOMPAT.
[12] H. Jin, et al. The OpenMP Implementation of NAS Parallel Benchmarks and Its Performance, 1999.
[13] S. Parekh, et al. Tuning Compiler Optimizations for Simultaneous Multithreading, 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[14] Eduard Ayguadé, et al. A Library Implementation of the Nano-Threads Programming Model, 1996, Euro-Par, Vol. II.
[15] Milind Girkar, et al. Parafrase-2: an Environment for Parallelizing, Partitioning, Synchronizing, and Scheduling Programs on Multiprocessors, 1989, Int. J. High Speed Comput.
[16] Allan Snavely, et al. Data Intensive Volume Visualization on the Tera MTA and Cray T3E, 1999.
[17] Mauricio J. Serrano, et al. Performance estimation of multistreamed, superscalar processors, 1994, 1994 Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences.
[18] Eduard Ayguadé, et al. Thread fork/join techniques for multi-level parallelism exploitation in NUMA multiprocessors, 1999, ICS '99.
[19] Eduard Ayguadé, et al. NanosCompiler: supporting flexible multilevel parallelism exploitation in OpenMP, 2000, Concurr. Pract. Exp.
[20] Larry Carter, et al. Multi-processor Performance on the Tera MTA, 1998, Proceedings of the IEEE/ACM SC98 Conference.
[21] Balaram Sinharoy, et al. Design and implementation of the POWER5 microprocessor, 2004, Proceedings. 41st Design Automation Conference, 2004.
[22] José E. Moreira, et al. Dissecting Cyclops: a detailed analysis of a multithreaded architecture, 2003, CARN.
[23] Dean M. Tullsen, et al. Simultaneous multithreading: Maximizing on-chip parallelism, 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[24] Constantine D. Polychronopoulos, et al. α-coral: a multigrain, multithreaded processor architecture, 2001, ICS '01.