An Innovative Framework for Supporting Frequent Pattern Mining Problems in IoT Environments

In the current era of big data, high volumes of a wide variety of data of differing veracity can be easily generated or collected at high velocity from rich data sources, including devices in the Internet of Things (IoT). Embedded in these big data are useful information and valuable knowledge. Hence, frequent pattern mining and the related research problem of association rule mining have drawn the attention of many researchers. Both aim to discover implicit, previously unknown, and potentially useful information and knowledge from these big data, in the form of sets of frequently co-occurring items or rules revealing relationships between these frequent sets. Since the introduction of these research problems, numerous information system and engineering approaches have been developed. These include serial algorithms, distributed and parallel algorithms, and MapReduce-based big data mining algorithms, which can be run on local computers, in distributed and parallel environments, and on clusters, grids, and clouds. In this paper, we describe some of these algorithms and discuss how to mine frequent patterns or association rules in fogs (i.e., at the edges of the computing network).
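
To make the two mining tasks named above concrete, the following is a minimal serial sketch in the Apriori style: it finds sets of frequently co-occurring items and then derives association rules from them. It is an illustration only, not the framework described in this paper; the function names, the `minsup` and `minconf` thresholds, and the toy transactions are all assumptions introduced here.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support count} for itemsets occurring in >= minsup transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count individual items.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: n for s, n in counts.items() if n >= minsup}
    result = dict(current)
    k = 2
    while current:
        # Join: build size-k candidates from pairs of frequent (k-1)-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune: keep candidates whose (k-1)-subsets are all frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count how many transactions contain each surviving candidate.
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        current = {s: n for s, n in counts.items() if n >= minsup}
        result.update(current)
        k += 1
    return result


def association_rules(itemsets, minconf):
    """Return (antecedent, consequent, confidence) triples with confidence >= minconf."""
    rules = []
    for itemset, support in itemsets.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is frequent, so the lookup succeeds.
                confidence = support / itemsets[lhs]
                if confidence >= minconf:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


# Hypothetical toy data: each transaction is the set of items observed together.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
patterns = frequent_itemsets(transactions, minsup=3)
rules = association_rules(patterns, minconf=0.75)
```

The distributed, parallel, and MapReduce-based algorithms mentioned above address the same task; they differ mainly in how the counting of candidate supports is partitioned across machines.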
