Paper information - An intelligent approach to improve the performance of a data warehouse cache based on association rules

Abstract: In business applications, identifying customer patterns and trends is essential for making tactical and strategic decisions. The data warehouse consolidates an organization's data and turns it into meaningful management information, from which interesting patterns can be discovered by applying a knowledge discovery process. Online analytical processing (OLAP), combined with related technologies such as data mining, can meet an organization's needs for business management analysis. Most analytical activities are carried out remotely, and Data Warehouse systems hold very large volumes of data, so tools are needed that help applications access the requested information quickly. Because the Data Warehouse is not updated very frequently, query performance can be improved by storing the data that queries retrieve in a cache. However, even the most powerful systems lack the memory to cache an entire database. The chunk caching technique keeps query results in the cache as chunks of values instead of storing them as large tables. In this paper, we propose a new technique for caching multidimensional queries based on association rules. This technique lets all users benefit fully from Data Warehousing, improving performance and increasing use of the system while reducing response time. The technique is built on an architecture comprising a data warehouse, a memory cache on the server, and one on each user's machine, in which the association rules and query results are stored. The results are kept in the form of chunks to retain all the advantages of fragmentation into chunks. The approach has been implemented and tested on a large real-world data set, and the results and their analysis are presented.
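The abstract does not specify how rules are mined or how the cache uses them, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes that past query sessions are recorded as lists of chunk identifiers, that rules are simple pairwise rules filtered by support and confidence, and that the cache evicts in least-recently-used order. The names `mine_pair_rules`, `ChunkCache`, and `fetch_from_warehouse` are hypothetical.

```python
from collections import Counter, OrderedDict
from itertools import permutations


def mine_pair_rules(sessions, min_support=0.5, min_confidence=0.6):
    """Mine pairwise rules "chunk a -> chunk b" from past sessions.

    Each session is the collection of chunk ids requested together.
    Returns a dict mapping an antecedent chunk id to its predicted chunk ids.
    """
    n = len(sessions)
    item_count = Counter()
    pair_count = Counter()
    for session in sessions:
        items = sorted(set(session))
        item_count.update(items)
        pair_count.update(permutations(items, 2))  # ordered pairs (a, b)
    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n                 # share of sessions with a and b
        confidence = count / item_count[a]  # share of a-sessions that have b
        if support >= min_support and confidence >= min_confidence:
            rules.setdefault(a, []).append(b)
    return rules


class ChunkCache:
    """LRU cache of chunks that prefetches chunks predicted by the rules."""

    def __init__(self, capacity, fetch_chunk, rules):
        self.capacity = capacity
        self.fetch_chunk = fetch_chunk  # assumed callable: chunk id -> data
        self.rules = rules
        self.chunks = OrderedDict()     # oldest entry first
        self.hits = 0
        self.misses = 0

    def _store(self, chunk_id, data):
        self.chunks[chunk_id] = data
        self.chunks.move_to_end(chunk_id)
        while len(self.chunks) > self.capacity:
            self.chunks.popitem(last=False)  # evict least recently used

    def get(self, chunk_id):
        if chunk_id in self.chunks:
            self.hits += 1
            self.chunks.move_to_end(chunk_id)
            data = self.chunks[chunk_id]
        else:
            self.misses += 1
            data = self.fetch_chunk(chunk_id)
            self._store(chunk_id, data)
        # Prefetch chunks that the rules associate with the requested one.
        for predicted in self.rules.get(chunk_id, []):
            if predicted not in self.chunks:
                self._store(predicted, self.fetch_chunk(predicted))
        return data


def fetch_from_warehouse(chunk_id):
    # Hypothetical stand-in for a slow query against the data warehouse.
    return "data-" + chunk_id


history = [["A", "B"], ["A", "B", "C"], ["A", "B"], ["C", "D"]]
demo_rules = mine_pair_rules(history)
demo_cache = ChunkCache(capacity=2, fetch_chunk=fetch_from_warehouse,
                        rules=demo_rules)
demo_cache.get("A")  # miss; the rule A -> B also prefetches chunk B
demo_cache.get("B")  # served from the cache
print(demo_rules, demo_cache.hits, demo_cache.misses)
```

In an architecture like the one the abstract describes, one such cache could sit on the server and another on each user's machine. How the two tiers are coordinated is not stated in the abstract and is left out of this sketch.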
