Exploration of the Influence of Program Inputs on CMP Co-scheduling

Recent studies have shown that job co-scheduling is effective in alleviating shared-cache contention on Chip Multiprocessors (CMPs). Although program inputs significantly affect cache usage, and thus cache contention, their influence on co-scheduling remains unexplored. In this work, we measure that influence and show that the ability to adapt to program inputs is important for a co-scheduler to work effectively on CMPs. We then explore how to address the influence by constructing cross-input predictive models for some memory behaviors that are critical to a recently proposed co-scheduler. The exploration compares the effectiveness of linear and non-linear regression techniques in building the models. Finally, we systematically measure the sensitivity of co-scheduling to errors in the predictive behavior models. The results demonstrate the potential of the predictive models for guiding contention-aware co-scheduling.
