Analytical modelling and optimization analysis of large-scale communication systems and networks with repairmen policy

Large-scale communication systems and computer networks provide flexible, efficient, and highly available services to their users. In practice, however, large-scale systems behave unpredictably and are prone to faults, often with detrimental outcomes. This motivates the design and development of analytical models that help to understand and predict complex system behaviour in order to ensure the availability of large-scale systems. In this paper, analytical modelling and optimization analysis are presented for large-scale systems. The contribution of this paper is twofold. First, a generic approximate solution approach is adapted and developed for performability modelling, which considers both the performance and the availability of a large number of nodes served by multiple repairmen. The analytical model and solution presented here can handle systems with up to thousands of nodes and can incorporate the availability issues of the system. Second, and more importantly, the relationship between the number of nodes and the number of repairmen is presented together with an optimization analysis for large-scale systems. To show the efficacy and accuracy of the proposed approach, the results obtained from the analytical model are validated against results obtained from simulations.
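To make the relationship between the number of nodes and the number of repairmen concrete, the following is a minimal sketch based on the classic machine-repairman birth-death model, not on the approximate solution approach developed in the paper. It assumes exponentially distributed failure and repair times, identical nodes, and ignores job queueing entirely; the function names, parameters, and the availability target are illustrative assumptions.

```python
def failed_node_distribution(n_nodes, n_repairmen, failure_rate, repair_rate):
    """Steady-state probabilities of having k = 0..n_nodes failed nodes.

    Assumed model (not the paper's): a birth-death chain in which each
    operative node fails at `failure_rate` and each busy repairman
    completes a repair at `repair_rate`.
    """
    weights = [1.0]
    for k in range(n_nodes):
        birth = (n_nodes - k) * failure_rate            # k -> k + 1 failed
        death = min(k + 1, n_repairmen) * repair_rate   # k + 1 -> k failed
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    return [w / total for w in weights]


def node_availability(n_nodes, n_repairmen, failure_rate, repair_rate):
    """Expected fraction of nodes that are operative in steady state."""
    probs = failed_node_distribution(n_nodes, n_repairmen,
                                     failure_rate, repair_rate)
    mean_operative = sum((n_nodes - k) * p for k, p in enumerate(probs))
    return mean_operative / n_nodes


def min_repairmen(n_nodes, failure_rate, repair_rate, target_availability):
    """Smallest number of repairmen meeting the target, or None if unreachable."""
    for r in range(1, n_nodes + 1):
        if node_availability(n_nodes, r, failure_rate, repair_rate) >= target_availability:
            return r
    return None
```

Calling `min_repairmen(n_nodes, failure_rate, repair_rate, target_availability)` for increasing `n_nodes` gives a rough picture of how the required number of repairmen grows with system size under these simplifying assumptions. With one repairman per node, the nodes become independent and the availability reduces to `repair_rate / (failure_rate + repair_rate)`, which is the upper bound any repair policy in this sketch can reach.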
