Fuzzy pattern tree approach for mining frequent patterns from gene expression data

Frequent pattern mining has been a central theme in data mining research for over a decade. A large body of literature is dedicated to it, and a great deal of work has been done, ranging from efficient and scalable algorithms for frequent itemset mining in transaction databases to numerous research frontiers. Frequent pattern mining (FPM) has been applied successfully to business and scientific data for discovering interesting association patterns, and it is becoming a promising strategy in microarray gene expression analysis. Fuzzy logic provides a mathematical framework that is well suited to data that are quantitatively imprecise yet qualitatively significant. In this paper, we fuzzify our original dataset and apply various frequent pattern mining techniques to discover meaningful frequent patterns. We also compare the frequent pattern mining techniques on the original and the fuzzified data in terms of the runtime of the algorithm and the number of frequent patterns generated. We found that the fuzzified dataset is capable of yielding a large number of frequent patterns and has a better running time than the original data.
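To illustrate the general idea of mining frequent patterns from fuzzified expression data, the following is a minimal sketch, not the method used in this paper. It assumes expression values normalised to [0, 1], three hypothetical triangular fuzzy sets ("low", "medium", "high"), and fuzzy support defined as the average, over samples, of the minimum membership degree in an itemset. For brevity it enumerates candidate itemsets by brute force rather than building a pattern tree.

```python
from itertools import combinations


def triangular(x, a, b, c):
    """Triangular membership function: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets over expression values normalised to [0, 1].
# These breakpoints are assumptions chosen for illustration only.
FUZZY_SETS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def fuzzify(sample):
    """Map {gene: value} to {(gene, label): membership}, dropping zero degrees."""
    fuzzy = {}
    for gene, value in sample.items():
        for label, (a, b, c) in FUZZY_SETS.items():
            mu = triangular(value, a, b, c)
            if mu > 0:
                fuzzy[(gene, label)] = mu
    return fuzzy


def fuzzy_support(itemset, fuzzy_samples):
    """Average over samples of the minimum membership among the itemset's items."""
    total = sum(min(fs.get(item, 0.0) for item in itemset) for fs in fuzzy_samples)
    return total / len(fuzzy_samples)


def frequent_itemsets(fuzzy_samples, min_support, max_size=2):
    """Brute-force search for itemsets whose fuzzy support meets min_support."""
    items = sorted({item for fs in fuzzy_samples for item in fs})
    frequent = {}
    for k in range(1, max_size + 1):
        for itemset in combinations(items, k):
            # Skip itemsets that assign two labels to the same gene.
            if len({gene for gene, _ in itemset}) < k:
                continue
            support = fuzzy_support(itemset, fuzzy_samples)
            if support >= min_support:
                frequent[itemset] = support
    return frequent


# Toy data: three samples, two genes (values are made up).
samples = [
    {"g1": 0.9, "g2": 0.1},
    {"g1": 0.8, "g2": 0.2},
    {"g1": 0.2, "g2": 0.9},
]
fuzzy_samples = [fuzzify(s) for s in samples]
patterns = frequent_itemsets(fuzzy_samples, min_support=0.4)
for itemset, support in patterns.items():
    print(itemset, round(support, 3))
```

In this toy setting, a gene contributes partially to several labels at once, so a pattern can accumulate support from samples that a hard threshold would discard. A tree-based miner would replace the brute-force loop in `frequent_itemsets` while keeping the same fuzzification step.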
