Automatic Fine-Grained Transaction Categorization for Multi-tier Applications

Multi-tier architecture has become the industry standard for building Web applications. These applications serve multiple categories of transactions. Properly categorizing the transactions and accurately characterizing the resource usage of each category are crucial for modeling the performance of multi-tier applications. Existing studies either ignore the transaction categorization problem by simply using the URL path as the category identifier, or require complex monitoring infrastructure. In this paper we propose a method called Transaction ICA, which automatically categorizes transactions based only on widely available Web access logs and aggregate resource utilization data. The method uses the URL path as the initial categorization and iteratively splits and merges categories based on estimated resource usage. It combines a regression-based resource usage estimation technique with an independent component analysis based request categorization technique. We validate the feasibility of our method using a synthetic 2-tier Web application. The experiments show that the method correctly categorizes transactions into coherent groups and gives accurate per-category resource demands; the resulting categorization is also more fine-grained than that of the existing method.
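To illustrate the regression-based resource usage estimation step mentioned above, the following is a minimal sketch, not the paper's implementation. It assumes that the busy time of a resource in each monitoring interval is approximately the sum, over categories, of the request count times a fixed per-request demand, and it fits those demands by ordinary least squares with no intercept. The function names, the sample data, and the no-intercept model are all illustrative assumptions; the ICA-based split and merge steps are not shown.

```python
from typing import List


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] without mutating the inputs.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: categories are not separable")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def estimate_demands(counts: List[List[float]],
                     busy_time: List[float]) -> List[float]:
    """Least-squares estimate of per-request resource demand per category.

    counts[t][k]  -- requests of category k seen in interval t (from the access log)
    busy_time[t]  -- aggregate busy seconds of the resource in interval t
    """
    k = len(counts[0])
    # Normal equations: (A^T A) d = A^T u
    ata = [[sum(row[i] * row[j] for row in counts) for j in range(k)]
           for i in range(k)]
    atu = [sum(row[i] * u for row, u in zip(counts, busy_time))
           for i in range(k)]
    return solve_linear(ata, atu)


# Hypothetical data: two URL-based categories with true demands of
# 0.02 s and 0.05 s per request, observed over four intervals.
counts = [[10, 5], [20, 3], [5, 12], [8, 8]]
busy = [10 * 0.02 + 5 * 0.05,
        20 * 0.02 + 3 * 0.05,
        5 * 0.02 + 12 * 0.05,
        8 * 0.02 + 8 * 0.05]
demands = estimate_demands(counts, busy)
print(demands)
```

With noise-free data such as this, the fit recovers the assumed demands up to floating-point error. On real logs, a category whose regression residual stays large would be a natural candidate for splitting, and categories with similar estimated demands would be candidates for merging.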
