An efficient online system of concept based association rules mining

This paper presents a new text mining system for extracting concept-based association rules from online textual documents. The system is called "developed extracting association rules from textual documents". It labels documents automatically using a mathematical weighting formula, named the fuzzy weighting schema. A new algorithm is proposed for generating concept-based association rules; it uses a hash table as the data structure for the mining process. The experiments are carried out on a collection of scientific documents selected from MEDLINE on breast cancer treatments and side effects. The proposed system is compared with the earlier Apriori-concept system in terms of execution time and the quality of the extracted association rules. The results show that the proposed system always extracts fewer association rules than the Apriori-concept system. Moreover, its execution time is lower than that of the Apriori-concept system in all cases tested.
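The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of hash-table-based mining of concept associations, not the authors' method. It assumes each document has already been reduced to a list of concept labels, restricts itself to rules between two single concepts, and uses ordinary support and confidence thresholds. The function name `mine_concept_rules` and its parameters are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def mine_concept_rules(docs, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) rules between single concepts.

    Illustrative assumption: `docs` is a list of concept lists, one per
    document. Counts are kept in hash tables (Python dicts), so the
    collection is scanned once and no candidate sets are regenerated.
    """
    n = len(docs)
    single = defaultdict(int)  # concept -> number of documents containing it
    pair = defaultdict(int)    # (concept_a, concept_b) -> co-occurrence count

    for concepts in docs:
        unique = sorted(set(concepts))  # sorted so each pair has one key
        for c in unique:
            single[c] += 1
        for a, b in combinations(unique, 2):
            pair[(a, b)] += 1

    rules = []
    for (a, b), count in pair.items():
        support = count / n
        if support < min_support:
            continue
        # Try the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / single[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules
```

Calling `mine_concept_rules(docs, 0.5, 0.6)` on a small list of concept lists returns every single-concept rule whose pair appears in at least half of the documents and whose confidence is at least 0.6. Extending the sketch to larger concept sets would need additional hash tables keyed by longer tuples.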