Adaptive Performance Analysis in IoT Platforms

In this paper, we consider the problem of identifying multiple bottlenecks (a.k.a. bottleneck analysis) in IoT service platforms. To serve QoS-constrained applications, IoT platforms have grown in complexity, with non-stationary workloads and inter-task dependencies created by data flows crossing the platform’s nodes. These factors create multiple simultaneous “bottlenecks” (a bottleneck is an overload, expressed in terms of request processing time on a given node, that contributes to QoS degradation). Multi-bottlenecks are non-trivial to analyze because they may violate the assumptions typically made in classic performance analysis, such as analysis based on queuing theory models. Solving this analysis problem requires real-time collection and analysis of data that can be massive and, as a result, can degrade the performance of the NFV-based IoT Platform (NIP) by consuming bandwidth, computing, and storage resources. Data collection therefore needs to be reduced to the strict minimum that still allows effective analysis. We build an adaptive performance analysis method that optimizes bottleneck identification under a monitoring overhead budget associated with the different available metrics. Instead of systematically collecting all the NIP metrics, the proposed process determines the best subset of metrics to collect for efficient performance analysis. Experiments conducted on a practical use case show that the proposed method identifies bottlenecks accurately, in the presence of different bottleneck types and durations, with very few false positives and false negatives.
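The abstract does not spell out how the metric subset is chosen, so the following is only an illustrative sketch of budget-constrained metric selection, not the authors' method. It assumes each metric has a known positive monitoring cost and that a scoring function rates how well a subset of metrics supports bottleneck identification; the function `select_metrics`, the metric names, the costs, and the toy coverage score are all hypothetical. The heuristic shown is a plain greedy choice by marginal gain per unit cost.

```python
from typing import Callable, Dict, FrozenSet, List


def select_metrics(
    costs: Dict[str, float],
    score: Callable[[FrozenSet[str]], float],
    budget: float,
) -> List[str]:
    """Greedily pick metrics with the best score gain per unit of cost.

    Assumes every cost is strictly positive. Stops when no affordable
    metric improves the score any further.
    """
    selected: List[str] = []
    spent = 0.0
    current = score(frozenset())
    remaining = set(costs)
    while remaining:
        best, best_ratio, best_score = None, 0.0, current
        for metric in sorted(remaining):  # sorted for deterministic ties
            if spent + costs[metric] > budget:
                continue  # this metric would exceed the overhead budget
            candidate = score(frozenset(selected + [metric]))
            ratio = (candidate - current) / costs[metric]
            if ratio > best_ratio:
                best, best_ratio, best_score = metric, ratio, candidate
        if best is None:
            break
        selected.append(best)
        spent += costs[best]
        current = best_score
        remaining.remove(best)
    return selected


# Hypothetical example: which bottleneck types each metric helps reveal.
reveals = {
    "cpu": {"cpu_overload"},
    "mem": {"mem_overload"},
    "net_latency": {"net_congestion", "cpu_overload"},
    "queue_len": {"cpu_overload", "mem_overload", "net_congestion"},
}
costs = {"cpu": 1.0, "mem": 1.0, "net_latency": 2.0, "queue_len": 3.0}


def coverage(subset: FrozenSet[str]) -> float:
    """Toy score: number of distinct bottleneck types the subset reveals."""
    covered = set()
    for metric in subset:
        covered |= reveals[metric]
    return float(len(covered))


chosen = select_metrics(costs, coverage, budget=3.0)
print(chosen)
```

In practice the score would more plausibly come from the measured accuracy of a bottleneck classifier trained on the candidate subset, which is the wrapper style of feature selection; the greedy rule above trades optimality for simplicity.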
