ASTRA - A Novel interest measure for unearthing latent temporal associations and trends through extending basic gaussian membership function

Time profiled association mining is an important and challenging research problem that has received relatively little attention. It poses two main challenges: i) designing a dissimilarity measure that holds the monotonicity property and can efficiently prune itemset associations, and ii) estimating the prevalence values of itemset associations over time. The pioneering work on time profiled association mining, by J.S. Yoo, uses Euclidean distance, a measure that is widely known to degrade in high-dimensional spaces. Given a time stamped transaction database, time profiled association mining refers to the discovery of hidden time profiled itemset associations whose true prevalence variations are similar to a user query sequence, subject to constraints that include i) an allowable dissimilarity value, ii) a reference query time sequence, and iii) a dissimilarity function that can find the degree of similarity between a temporal itemset and the reference. In this paper, we propose a novel dissimilarity measure whose design is a function of a product based Gaussian membership function, obtained by extending the similarity function proposed in our earlier research (G-SPAMINE). Our approach, MASTER (Mining of Similar Temporal Associations), which is primarily inspired by SPAMINE, uses the dissimilarity measure proposed in this paper and the support bound estimation approach proposed in our earlier research. Expressions for computing the distance bounds of temporal patterns are designed considering the proposed measure and support estimation approach. Experiments compare the proposed approach with the naïve, sequential, SPAMINE and G-SPAMINE approaches under various test cases that study scalability and computational performance. The experimental results demonstrate the scalability and efficiency of the proposed approach. The correctness and completeness of the proposed approach are also proved analytically.
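The problem statement above can be illustrated with a minimal sketch of the naïve baseline. It is not the proposed measure or the MASTER algorithm: it uses plain Euclidean distance (the measure attributed above to Yoo's work), enumerates every itemset without pruning, and assumes that prevalence in a time slot is the fraction of that slot's transactions containing the itemset. All function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def support_sequence(itemset, slots):
    """Prevalence time sequence: per-slot fraction of transactions containing itemset."""
    items = frozenset(itemset)
    return [sum(1 for t in slot if items <= t) / len(slot) for slot in slots]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def naive_similar_itemsets(slots, reference, threshold):
    """Exhaustively test every itemset against the reference sequence (no pruning)."""
    items = sorted(set().union(*(t for slot in slots for t in slot)))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            seq = support_sequence(combo, slots)
            if euclidean(seq, reference) <= threshold:
                result[combo] = seq
    return result


# Toy time stamped database (assumed data): two time slots, four transactions each.
slots = [
    [frozenset("AB"), frozenset("A"), frozenset("BC"), frozenset("ABC")],
    [frozenset("A"), frozenset("AB"), frozenset("C"), frozenset("AC")],
]
reference = [0.5, 0.25]  # user query time sequence
found = naive_similar_itemsets(slots, reference, threshold=0.1)
print(found)
```

Because this baseline evaluates all 2^n - 1 itemsets, its cost grows exponentially with the number of items, which is why a dissimilarity measure with a monotonicity property and prevalence bound estimation matter for pruning.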
