Effective algorithm for frequent pattern mining

The Apriori algorithm is an influential data mining algorithm that generates a list of the most frequently visited web pages. Because database contents change quickly, an algorithm that works in real time is needed. The major drawback of Apriori is that it must scan the main database on every pass to generate frequent patterns, which increases both memory usage and execution time, and a lot of research has therefore been conducted to improve it. This paper proposes a modified version of Apriori for generating frequent patterns. The modified version finds patterns without going back to the database at every pass. This limits the number of scans and also brings the total number of combinations down from 2N to 2N-2. It considerably reduces memory usage and execution time and makes real-time pattern discovery possible.
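To make the drawback concrete, the following is a minimal sketch of the classic level-wise Apriori baseline, not the modified version proposed in this paper. It counts how many full passes over the transaction list are needed. The function name `apriori`, the `sessions` sample data, and the use of an absolute count for `min_support` are assumptions made for illustration only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Classic Apriori. Returns ({itemset: count}, number of database scans)."""
    transactions = [frozenset(t) for t in transactions]
    frequent = {}
    scans = 0

    # Pass 1: one scan to count every single item.
    scans += 1
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}

    k = 1
    while current:
        frequent.update(current)
        k += 1

        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        if not candidates:
            break

        # Every level needs another full scan of the database.
        scans += 1
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: c for s, c in counts.items() if c >= min_support}

    return frequent, scans


# Hypothetical web sessions: each one is the set of pages a visitor opened.
sessions = [
    ["home", "products", "cart"],
    ["home", "products"],
    ["home", "cart"],
    ["products", "cart"],
    ["home", "products", "cart"],
]
frequent, scans = apriori(sessions, min_support=3)
print(len(frequent), "frequent itemsets found in", scans, "scans")
```

In this baseline, the number of scans grows with the size of the longest candidate itemset, which is the cost the abstract refers to when it describes going back to the database at every pass.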
