Performance problem prediction in transaction-based e-business systems

Key areas in managing e-commerce systems are problem prediction, root cause analysis, and automated problem remediation. Anticipating service level objective (SLO) violations through proactive problem determination (PD) is particularly important because it can significantly lower the business impact of application performance problems. The main contribution of this paper is an investigation of proactive PD based on two concepts: dependency graphs and the dynamic runtime performance characteristics of the resources that make up an IT environment. The authors show how to calculate, and then use, the contribution of each resource supporting a transaction to that transaction's end-to-end SLO. Higher-order moments of these per-component contributions are also tracked for proactive alerting. An important aspect of this process is the classification of user transactions by their resource usage profile, which makes it possible to set appropriate thresholds for each class. Combined with complete or semi-complete dependency information, the approach confines the scope of potential root causes to a small set of components, enabling efficient anticipation of performance problems and quick remediation.
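As a rough illustration of the idea, and not the authors' actual algorithm, the sketch below splits an end-to-end SLO across the resources of one transaction class in proportion to their mean observed time, tracks the running mean and variance of each resource's time, and flags a resource when its latest observation exceeds either its SLO share or its mean plus k standard deviations. The proportional split, the factor k, and all names (`RunningMoments`, `component_thresholds`, `flag_components`, the sample resources and timings) are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import sqrt


@dataclass
class RunningMoments:
    """Online mean and sample variance (Welford's algorithm)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        return sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def component_thresholds(mean_times: dict[str, float], slo: float) -> dict[str, float]:
    """Assumed rule: split the SLO in proportion to each resource's mean time."""
    total = sum(mean_times.values())
    return {name: slo * t / total for name, t in mean_times.items()}


def flag_components(history: dict[str, list[float]],
                    latest: dict[str, float],
                    slo: float,
                    k: float = 3.0) -> list[str]:
    """Return resources whose latest time looks anomalous for this class."""
    moments = {}
    for name, samples in history.items():
        rm = RunningMoments()
        for x in samples:
            rm.update(x)
        moments[name] = rm

    shares = component_thresholds({n: m.mean for n, m in moments.items()}, slo)
    flagged = []
    for name, x in latest.items():
        m = moments[name]
        # Alert on the SLO share, or earlier on a statistical deviation.
        if x > shares[name] or x > m.mean + k * m.std:
            flagged.append(name)
    return flagged


# Hypothetical per-resource times (ms) for one transaction class.
history = {
    "web": [10, 12, 11, 9],
    "app": [30, 28, 32, 30],
    "db": [60, 58, 62, 60],
}
latest = {"web": 11, "app": 31, "db": 90}
result = flag_components(history, latest, slo=200.0)
print(result)
```

In this made-up data the database time of 90 ms is still below its share of the 200 ms SLO, yet it is flagged because it deviates strongly from its own history, which is the sense in which such an alert is proactive. Restricting the candidates to the resources in the transaction's dependency graph is what keeps the suspect set small.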
