Experiences Understanding Performance in a Commercial Scale-Out Environment

Clusters of loosely connected machines are becoming an important model for commercial computing. Their cost/performance ratio makes these scale-out solutions an attractive platform for a class of computational needs. The work we describe in this paper focuses on understanding performance when using a scale-out environment to run commercial workloads. We describe the novel scale-out environment we configured and the workload we ran on it. We explain the unique performance challenges faced in such an environment, and the tools we applied and improved to address those challenges. We present data from the tools that proved useful in optimizing performance on our system. We discuss the lessons we learned in applying existing tools to a commercial scale-out environment and modifying them for it, and offer insights into making future performance tools effective in this environment.
