Swarm intelligence for natural language processing

Natural language processing (NLP) is the field concerned with computational methods for achieving human-like language processing. Traditionally, NLP research has focused on developing efficient and robust algorithms for most NLP tasks, including syntactic and semantic analysis, grammar induction, summarisation and text generation, document clustering, and machine translation. Swarm intelligence (SI) methods are effective for this purpose, since they have been applied successfully to many real-world problems. NLP and SI have both been active research areas in recent years, and they have been combined more than once to solve problems in the NLP field. This paper presents a review of recent developments in the application of SI methods to NLP. It shows that only a few NLP tasks and applications have been tackled with SI-based algorithms: mainly text document clustering and classification, text summarisation, word sense disambiguation, information retrieval, and speaker recognition. The study also shows that four SI-based algorithms have been examined in the NLP field, namely ant colony optimisation (ACO), particle swarm optimisation (PSO), bee swarm optimisation (BSO), and the firefly algorithm (FA), with ACO and PSO the most investigated.
