RANWAR: Rank-Based Weighted Association Rule Mining From Gene Expression and Methylation Data

Ranking association rules is an active topic in data mining and bioinformatics. Association rule mining (ARM) algorithms generate a huge number of rules over items (or genes), which makes it difficult for a decision maker to choose among them. In this article, we propose a weighted rule-mining technique, RANWAR (rank-based weighted association rule mining), that addresses this problem by ranking rules with two novel rule-interestingness measures: rank-based weighted condensed support (wcs) and weighted condensed confidence (wcc). Both measures depend on the rank of the items (genes): each item is assigned a weight according to its rank. RANWAR generates far fewer frequent itemsets than state-of-the-art association rule mining algorithms and therefore reduces execution time. We run RANWAR on gene expression and methylation datasets. The genes in the top rules are biologically validated through Gene Ontology (GO) and KEGG pathway analyses. Many top-ranked rules extracted by RANWAR that rank poorly under traditional Apriori are highly biologically significant for the related diseases. Finally, we report the top rules produced by RANWAR that are not found by Apriori.
