Memetic Algorithm Based Feature Selection for Handwritten City Name Recognition

Feature selection plays a key role in reducing the high dimensionality of the feature space in machine learning applications: it discards irrelevant and redundant features to obtain a subset that accurately describes a given problem with minimal or no degradation in performance. In this paper, a Memetic Algorithm (MA) based wrapper-filter feature selection framework is proposed for the recognition of handwritten Bangla city names. To evaluate the MA framework, a recently published feature extraction technique, reported in [1], is applied to this pattern recognition problem. Experiments are conducted on an in-house dataset of 6000 words written in Bangla script, prepared from the 40 most popular city names of West Bengal, a state in India. The proposed technique not only reduces the feature dimension but also significantly enhances the performance of the word recognition technique.
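To make the wrapper-filter idea concrete, the following is a minimal sketch of a memetic feature selection loop; it is not the implementation evaluated in this paper. A genetic search over binary feature masks is scored by a wrapper (classifier accuracy), and each mask is refined by a local search guided by a filter ranking. The synthetic data, the nearest-centroid classifier, the Fisher-style filter score, the size penalty of 0.01 per feature, and all function names are assumptions chosen for illustration.

```python
import random

def make_data(n=120, d=12, informative=3, seed=0):
    """Synthetic two-class data (assumption); only the first features carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(2.0 * label if j < informative else 0.0, 1.0)
                  for j in range(d)])
        y.append(label)
    return X, y

def filter_scores(X, y):
    """Filter step: squared class-mean gap over pooled within-class variance."""
    scores = []
    for j in range(len(X[0])):
        groups = {0: [], 1: []}
        for row, label in zip(X, y):
            groups[label].append(row[j])
        means = {c: sum(v) / len(v) for c, v in groups.items()}
        var = sum((x - means[c]) ** 2 for c, v in groups.items() for x in v) / len(X)
        scores.append((means[0] - means[1]) ** 2 / (var + 1e-12))
    return scores

def wrapper_accuracy(mask, X, y):
    """Wrapper step: hold-out accuracy of a nearest-centroid classifier."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    half = len(X) // 2
    cents = {}
    for c in (0, 1):
        rows = [r for r, l in zip(X[:half], y[:half]) if l == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    correct = 0
    for r, l in zip(X[half:], y[half:]):
        dist = {c: sum((r[j] - m) ** 2 for j, m in zip(idx, cents[c])) for c in cents}
        correct += int(min(dist, key=dist.get) == l)
    return correct / (len(X) - half)

def fitness(mask, X, y, penalty=0.01):
    """Accuracy minus a small per-feature penalty (assumed trade-off)."""
    return wrapper_accuracy(mask, X, y) - penalty * sum(mask)

def local_search(mask, scores, X, y):
    """Memetic refinement: add the best-ranked unused feature and drop the
    worst-ranked used one, keeping a move only if fitness improves."""
    best = mask[:]
    off = [j for j, b in enumerate(best) if not b]
    on = [j for j, b in enumerate(best) if b]
    moves = []
    if off:
        moves.append((max(off, key=lambda j: scores[j]), 1))
    if len(on) > 1:
        moves.append((min(on, key=lambda j: scores[j]), 0))
    for j, bit in moves:
        cand = best[:]
        cand[j] = bit
        if fitness(cand, X, y) > fitness(best, X, y):
            best = cand
    return best

def memetic_select(X, y, pop_size=10, generations=15, seed=1):
    rng = random.Random(seed)
    d, half = len(X[0]), len(X) // 2
    scores = filter_scores(X[:half], y[:half])  # rank on the training half only
    pop = [[rng.randint(0, 1) for _ in range(d)] for _ in range(pop_size)]
    for _ in range(generations):
        pop = [local_search(m, scores, X, y) for m in pop]
        pop.sort(key=lambda m: fitness(m, X, y), reverse=True)
        elite = pop[:pop_size // 2]          # elitist survivor selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, d)        # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:           # occasional bit-flip mutation
                child[rng.randrange(d)] ^= 1
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, X, y))

X, y = make_data()
best = memetic_select(X, y)
print("selected features:", [j for j, b in enumerate(best) if b])
print("hold-out accuracy:", wrapper_accuracy(best, X, y))
```

In this sketch the filter ranking is computed once and only steers the local moves, while every accept or reject decision is made by the wrapper score, which is one common way to combine the two. A real evaluation would replace the toy data and classifier with the actual features and recognizer.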

[1] Jinchang Ren, et al. Performance of hidden Markov model and dynamic Bayesian network classifiers on handwritten Arabic word recognition, 2011, Knowl. Based Syst.

[2] Anil K. Jain, et al. Simultaneous feature selection and clustering using mixture models, 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Ahmad Faraahi, et al. A novel memetic feature selection algorithm, 2013, The 5th Conference on Information and Knowledge Technology.

[4] Ferat Sahin, et al. A survey on feature selection methods, 2014, Comput. Electr. Eng.

[5] Keinosuke Fukunaga, et al. A Branch and Bound Algorithm for Feature Subset Selection, 1977, IEEE Transactions on Computers.

[6] Fumitaka Kimura, et al. Handwritten Street Name Recognition for Indian Postal Automation, 2011, 2011 International Conference on Document Analysis and Recognition.

[7] Goldberg, et al. Genetic algorithms, 1993, Robust Control Systems with Genetic Algorithms.

[8] Edward M. Riseman, et al. Word spotting: a new approach to indexing handwriting, 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[9] Andrew P. Witkin, et al. Scale-space filtering: A new approach to multi-scale description, 1984, ICASSP.

[10] Fumitaka Kimura, et al. Bangla and English City Name Recognition for Indian Postal Automation, 2010, 2010 20th International Conference on Pattern Recognition.

[11] Isabelle Guyon, et al. An Introduction to Variable and Feature Selection, 2003, J. Mach. Learn. Res.

[12] Fumitaka Kimura, et al. Multi-lingual City Name Recognition for Indian Postal Automation, 2012, 2012 International Conference on Frontiers in Handwriting Recognition.

[13] Bidyut Baran Chaudhuri, et al. Automation of Indian Postal Documents Written in Bangla and English, 2009, Int. J. Pattern Recognit. Artif. Intell.

[14] Khaled Mahmud, et al. Bangla automatic number plate recognition system using artificial neural network, 2012.

[15] Mengjie Zhang, et al. Particle Swarm Optimization for Feature Selection in Classification: A Multi-Objective Approach, 2013, IEEE Transactions on Cybernetics.

[16] M. W Gardner, et al. Artificial neural networks (the multilayer perceptron): a review of applications in the atmospheric sciences, 1998.

[17] Mita Nasipuri, et al. A holistic word recognition technique for handwritten Bangla words, 2015, Int. J. Appl. Pattern Recognit.

[18] Mita Nasipuri, et al. Bangla Handwritten City Name Recognition Using Gradient-Based Feature, 2016, FICTA.

[19] Amparo Alonso-Betanzos, et al. Filter Methods for Feature Selection - A Comparative Study, 2007, IDEAL.

[20] Subhadip Basu, et al. A hierarchical approach to recognition of handwritten Bangla characters, 2009, Pattern Recognit.

[21] Kenneth A. De Jong, et al. A formal analysis of the role of multi-point crossover in genetic algorithms, 1992, Annals of Mathematics and Artificial Intelligence.

[22] Mita Nasipuri, et al. Handwritten Bangla Word Recognition Using Elliptical Features, 2014, 2014 International Conference on Computational Intelligence and Communication Networks.

[23] Subhadip Basu, et al. Word Extraction and Character Segmentation from Text Lines of Unconstrained Handwritten Bangla Document Images, 2011, J. Intell. Syst.

[24] Paul Geladi, et al. Principal Component Analysis, 1987, Comprehensive Chemometrics.

[25] Y. Censor. Pareto optimality in multiobjective problems, 1977.

[26] Mita Nasipuri, et al. A Holistic Approach for Handwritten Hindi Word Recognition, 2017, Int. J. Comput. Vis. Image Process.

[27] Changning Huang, et al. Chinese Word Segmentation and Named Entity Recognition: A Pragmatic Approach, 2005, CL.

[28] Li-Yeh Chuang, et al. Feature Selection Using Memetic Algorithms, 2008, 2008 Third International Conference on Convergence and Hybrid Information Technology.

[29] Simon M. Lucas, et al. Top-Down Likelihood Word Image Generation Model for Holistic Word Recognition, 2002, Document Analysis Systems.

[30] Anil K. Jain, et al. Dimensionality reduction using genetic algorithms, 2000, IEEE Trans. Evol. Comput.

[31] Josef Kittler, et al. Floating search methods in feature selection, 1994, Pattern Recognit. Lett.

[32] Mita Nasipuri, et al. Handwritten Bangla Word Recognition Using HOG Descriptor, 2014, 2014 Fourth International Conference of Emerging Applications of Information Technology.

[33] Fumitaka Kimura, et al. A Lexicon-Driven Handwritten City-Name Recognition Scheme for Indian Postal Automation, 2009, IEICE Trans. Inf. Syst.

[34] Javier Pérez-Rodríguez, et al. A Scalable Memetic Algorithm for Simultaneous Instance and Feature Selection, 2014, Evolutionary Computation.

[35] Marco Vannucci, et al. A Hybrid Feature Selection Method for Classification Purposes, 2014, 2014 European Modelling Symposium.

[36] Prasenjit Dey, et al. HMM-based Indic handwritten word recognition using zone segmentation, 2016, Pattern Recognit.

[37] Zexuan Zhu, et al. Wrapper–Filter Feature Selection Algorithm Using a Memetic Framework, 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[38] Xue-wen Chen. An improved branch and bound algorithm for feature selection, 2003, Pattern Recognit. Lett.

[39] David S. Doermann, et al. The Indexing and Retrieval of Document Images: A Survey, 1998, Comput. Vis. Image Underst.

[40] Volkmar Frinken, et al. A Novel Word Spotting Method Based on Recurrent Neural Networks, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41] Marko Robnik-Sikonja, et al. Theoretical and Empirical Analysis of ReliefF and RReliefF, 2003, Machine Learning.

[42] Prasenjit Dey, et al. A Novel Approach of Bangla Handwritten Text Recognition Using HMM, 2014, 2014 14th International Conference on Frontiers in Handwriting Recognition.

[43] S B Kotsiantis, et al. RETRACTED ARTICLE: Feature selection for machine learning classification problems: a recent overview, 2014, Artificial Intelligence Review.

[44] Oscar Castillo, et al. A survey on nature-inspired optimization algorithms with fuzzy logic for dynamic parameter adaptation, 2014, Expert Syst. Appl.

[45] Bidyut Baran Chaudhuri, et al. A system towards Indian postal automation, 2004, Ninth International Workshop on Frontiers in Handwriting Recognition.

[46] Byung Ro Moon, et al. Hybrid Genetic Algorithms for Feature Selection, 2004, IEEE Trans. Pattern Anal. Mach. Intell.