Association Rules Mining for Urdu Language Using Transaction Hash Tables based Apriori (THT-Apriori)

This paper explains how association rules can play an important role in automating Urdu language processing, for example in creating a thesaurus, mining Urdu text on the web, and providing adaptive tools for use in printing and publishing. The Apriori algorithm is first used to extract strong associations from Urdu text, but it does not work well for Urdu text, so a new Association Rules Mining (ARM) algorithm named Transaction Hash Table Apriori (THT-Apriori) is proposed. Both algorithms are tested on different Urdu corpora, and the results show that THT-Apriori is better than Apriori in both aspects, i.e., time and number of association rules.
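For readers unfamiliar with the baseline, the following is a minimal sketch of classic Apriori applied to tokenized text, assuming each sentence is treated as one transaction and each word as one item. It is not the THT-Apriori algorithm, whose details are not given in this abstract. The function names `apriori` and `association_rules`, the absolute `min_support` count, and the `min_confidence` threshold are illustrative assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for all frequent itemsets.

    Assumption: min_support is an absolute count of transactions.
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count individual items (words).
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count step: one scan over all transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples from frequent itemsets."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so its support count is already in the dictionary.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules
```

Calling `apriori(sentences, 3)` on a list of token lists returns every word set that co-occurs in at least three sentences, and `association_rules(frequent, 0.75)` then keeps the rules whose confidence is at least 0.75. The count step rescans every transaction at each level, which is the usual cost bottleneck of plain Apriori.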
