System noise, OS clock ticks, and fine-grained parallel applications

As parallel jobs get bigger in size and finer in granularity, "system noise" is increasingly becoming a problem. In fact, fine-grained jobs on clusters with thousands of SMP nodes run faster if one processor per node is intentionally left idle, thus separating "system noise" from the computation. Paying a cost in average processing speed at a node for the sake of eliminating occasional process delays is (unfortunately) beneficial, as such delays are enormously magnified when one late process holds up thousands of peers with which it synchronizes. We provide a probabilistic argument showing that, under certain conditions, the effect of such noise is linearly proportional to the size of the cluster (as is often empirically observed). We then identify a major source of noise to be the indirect overhead of periodic OS clock interrupts ("ticks"), which are used by all general-purpose OSs as a means of maintaining control. This is shown for various grain sizes, platforms, tick frequencies, and OSs. To eliminate such noise, we suggest replacing ticks with an alternative mechanism we call "smart timers". This turns out to also be in line with the needs of desktop and mobile computing, which increases the chances that the suggested change will be accepted.
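The linear scaling claim can be illustrated with a minimal sketch. It is not the paper's own derivation; it assumes that each of N nodes is delayed independently with the same small probability p during one compute phase, and that a phase is held up whenever at least one node is late. Under those assumptions the probability of a delayed phase is 1 - (1 - p)^N, which is close to N·p as long as N·p is much smaller than 1. The function names and the value of p below are hypothetical and chosen only for illustration.

```python
def prob_phase_delayed(n_nodes: int, p_node: float) -> float:
    """Probability that at least one of n_nodes is delayed in a phase.

    Assumes (for illustration only) independent nodes with an
    identical per-phase delay probability p_node.
    """
    return 1.0 - (1.0 - p_node) ** n_nodes


def linear_approx(n_nodes: int, p_node: float) -> float:
    """First-order approximation N * p, valid while N * p << 1."""
    return n_nodes * p_node


if __name__ == "__main__":
    p = 1e-6  # hypothetical per-node, per-phase delay probability
    for n in (10, 100, 1_000, 10_000):
        exact = prob_phase_delayed(n, p)
        approx = linear_approx(n, p)
        print(f"N={n:>6}  exact={exact:.6e}  N*p={approx:.6e}")
```

In this toy model the approximation stops being linear once N·p approaches 1, since the probability saturates at 1; this is one way to read the "under certain conditions" qualifier.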
