Automatic Window Design for Gray-Scale Image Processing Based on Entropy Minimization

This paper generalizes the technique described in [1] to gray-scale image processing applications. The method chooses the subset of variables W (i.e., the pixels seen through a window) that maximizes the information observed in a set of training data, by minimizing the mean conditional entropy. The task is formalized as a combinatorial optimization problem, where the search space is the power set of the candidate variables and the measure to be minimized is the mean entropy of the estimated conditional probabilities. Since a full exploration of the search space requires an enormous computational effort, heuristics from the feature selection literature are applied. The proposed approach is mathematically sound, and experimental results on a texture recognition application show that it is also adequate for problems with gray-scale images.
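To make the selection criterion concrete, the sketch below estimates the mean conditional entropy of a candidate window from a single training pair (observed image, ideal output) and grows the window by plain sequential forward selection, a simple stand-in for heuristics such as the floating search of [8]. All function names and the toy data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from collections import Counter, defaultdict

def mean_conditional_entropy(obs, ideal, window):
    """Estimate E[H(Y | X_W)]: the entropy of the ideal pixel value Y
    given the gray-scale pattern X_W seen through window W, averaged
    over the empirical distribution of patterns in the training pair."""
    hist = defaultdict(Counter)  # pattern seen through W -> histogram of Y
    dys = [dy for dy, _ in window]
    dxs = [dx for _, dx in window]
    # Restrict the scan so every window offset stays inside the image.
    y0, y1 = max(0, -min(dys)), obs.shape[0] - max(0, max(dys))
    x0, x1 = max(0, -min(dxs)), obs.shape[1] - max(0, max(dxs))
    for y in range(y0, y1):
        for x in range(x0, x1):
            pattern = tuple(obs[y + dy, x + dx] for dy, dx in window)
            hist[pattern][ideal[y, x]] += 1
    total = sum(sum(c.values()) for c in hist.values())
    mce = 0.0
    for counter in hist.values():
        n = sum(counter.values())
        p = np.array(list(counter.values()), dtype=float) / n
        # Weight H(Y | X_W = w) by the empirical probability P(X_W = w).
        mce += (n / total) * -np.sum(p * np.log2(p))
    return mce

def select_window(obs, ideal, candidates, size):
    """Sequential forward selection: grow W one offset at a time,
    always adding the candidate that yields the lowest mean entropy."""
    window = []
    for _ in range(size):
        best = min((c for c in candidates if c not in window),
                   key=lambda c: mean_conditional_entropy(obs, ideal, window + [c]))
        window.append(best)
    return window

# Hypothetical usage: pick 5 offsets out of a 5x5 candidate grid.
# Coarse quantization keeps the pattern space small enough to estimate.
rng = np.random.default_rng(0)
obs = rng.integers(0, 8, size=(64, 64))   # quantized observed image
ideal = (obs > 3).astype(int)             # toy ideal output
grid = [(dy, dx) for dy in range(-2, 3) for dx in range(-2, 3)]
print(select_window(obs, ideal, grid, 5))
```

Note that with full 256-level gray-scale images the number of distinct window patterns grows quickly, which is why reliable estimation of the conditional probabilities depends on the amount of training data and motivates keeping the window small.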

[1] David G. Stork, et al. Pattern Classification, 2nd Edition, 2000.

[2] Solomon Kullback, et al. Information Theory and Statistics, 1970, The Mathematical Gazette.

[3] David Correa Martins, et al. W-operator window design by minimization of mean conditional entropy, 2006, Pattern Analysis and Applications.

[4] Thomas M. Cover, et al. Elements of Information Theory, 2005.

[5] David Correa Martins, et al. W-operator window design by maximization of training data information, 2004, Proceedings of the 17th Brazilian Symposium on Computer Graphics and Image Processing.

[6] Paul A. Viola, et al. Alignment by Maximization of Mutual Information, 1997, International Journal of Computer Vision.

[7] Roberto Marcondes Cesar Junior, et al. Feature Selection Based on Fuzzy Distances between Clusters: First Results on Simulated Data, 2001, ICAPR.

[8] Josef Kittler, et al. Floating search methods in feature selection, 1994, Pattern Recognition Letters.

[9] Junior Barrera, et al. Automatic Programming of Morphological Machines by PAC Learning, 2000, Fundamenta Informaticae.

[10] E. Soofi. Principal Information Theoretic Approaches, 2000.

[11] David G. Stork, et al. Pattern Classification, 1973.

[12] Marco Zaffalon, et al. Robust Feature Selection by Mutual Information Distributions, 2002, UAI.

[13] Sameer Singh, et al. Advances in Pattern Recognition — ICAPR 2001, 2001, Lecture Notes in Computer Science.

[14] Sang Joon Kim, et al. A Mathematical Theory of Communication, 2006.

[15] Lloyd A. Smith, et al. Feature Selection for Machine Learning: Comparing a Correlation-Based Filter Approach to the Wrapper, 1999, FLAIRS.

[16] Edward R. Dougherty, et al. Multiresolution Analysis for Optimal Binary Filters, 2001, Journal of Mathematical Imaging and Vision.

[17] S. Kullback. Information Theory and Statistics, 1959.

[18] A. S. Weigend, et al. Selecting Input Variables Using Mutual Information and Nonparametric Density Estimation, 1994.

[19] David D. Lewis, et al. Feature Selection and Feature Extraction for Text Categorization, 1992, HLT.

[20] Anil K. Jain, et al. Feature Selection: Evaluation, Application, and Small Sample Performance, 1997, IEEE Transactions on Pattern Analysis and Machine Intelligence.