HiperJobViz: Visualizing Resource Allocations in High-Performance Computing Center via Multivariate Health-Status Data

Scheduling, visualizing, and balancing resource allocations in High-Performance Computing Centers are complicated tasks because of the large volume of data and the dynamic nature of the resource allocation problem. This paper introduces HiperJobViz, a visual analytics tool for visualizing the resource allocations of data centers for jobs, users, and usage statistics. The goals of this tool are: 1) to provide an overview of current resource usage, 2) to track changes in resource usage by users, jobs, and hosts, and 3) to provide a detailed view of resource usage via a multi-dimensional representation of health metrics, such as CPU temperature, memory usage, and power consumption. To support these goals, the tool provides a range of interactive features, including details on demand, brushing and linking, and filtering.
