Big data analytics with swarm intelligence

The quality and quantity of data are vital to effective problem solving. Big data analytics, which requires managing an immense amount of data rapidly, has attracted growing attention as a new research area in information processing. It faces three main difficulties: a large amount of data, high dimensionality, and data that change dynamically. Such issues might be addressed with the help of other research fields, e.g., swarm intelligence (SI), a collection of nature-inspired search techniques. The paper aims to discuss these issues.

In this paper, the potential application of SI in big data analytics is analyzed, and the correspondence and association between big data analytics and SI techniques are discussed. As one example of applying SI algorithms to big data processing, a commodity routing system in a port in China is introduced. Another example is the economic load dispatch problem in the planning of a modern power system.

The characteristics of big data include volume, variety, velocity, veracity, and value. In SI algorithms, these features can be represented, respectively, as large-scale, high-dimensional, dynamic, noise/surrogate, and fitness/objective problems, which have been effectively solved.

In the current research, the example problem of the port is formulated but not yet solved, given the ongoing nature of the project. The example could be understood as advanced IT or data processing technology; however, its underlying mechanism could be the SI algorithms. This paper is the first step in research on applying SI algorithms to a big data analytics problem. Future research will compare the performance of the method and fit it to a dynamic real system.

Based on the combination of SI and data mining techniques, a better understanding of big data analytics problems can be gained, and more effective algorithms can be designed to solve real-world big data analytical problems.
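As a rough illustration of what an SI search technique looks like in practice, the following is a minimal sketch of particle swarm optimization (PSO), one widely used SI algorithm. It is not the method of the paper: the function names (`sphere`, `pso`), the toy objective, the search bounds, and the parameter values (common textbook choices for the inertia weight and acceleration coefficients) are all assumptions made for the example. The `dim` argument stands in for the dimensionality of a problem and the `fitness` callable for its objective.

```python
import random


def sphere(x):
    """Toy fitness function (assumed for illustration): sum of squares."""
    return sum(v * v for v in x)


def pso(fitness, dim, n_particles=20, iters=200, lo=-5.0, hi=5.0,
        w=0.7298, c1=1.49618, c2=1.49618, seed=0):
    """Minimize `fitness` over [lo, hi]^dim with a basic global-best PSO."""
    rng = random.Random(seed)
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position so far.
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    # The swarm shares the best position found by any particle.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then clamp the position to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    best_pos, best_val = pso(sphere, dim=5)
    print("best fitness found:", best_val)
```

In this sketch, a large-scale or high-dimensional problem would correspond to a larger `dim`, and a noisy or surrogate-based problem to a `fitness` function that only approximates the true objective. Handling dynamic problems would need additional machinery (for example, re-evaluating stored best positions), which is omitted here.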
