Reducing the Computational Cost of Computing Approximated Median Strings

The k-Nearest Neighbour (k-NN) rule is one of the most popular techniques in Pattern Recognition. To achieve good results at a reasonable computational cost, it requires good prototypes. When objects are represented by strings, the median string of a set of strings could be the best prototype for representing the whole set (i.e., the class of the objects). However, computing the median string is an NP-hard problem, so only approximations to it can be computed at a reasonable cost. Although the algorithms proposed for obtaining approximated median strings are polynomial, their computational cost is still high (cubic order), which makes obtaining the prototypes very costly. In this work, we propose several techniques to reduce this computational cost without degrading the classification performance of the Nearest Neighbour rule.
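
To make the notion of a string prototype concrete, the following minimal sketch computes the set median, i.e. the string within the set that minimises the sum of edit distances to all the others. This is a common baseline and not the approximation technique proposed in this work; the true median string may lie outside the set. The function names `edit_distance` and `set_median` and the choice of unit-cost Levenshtein distance are assumptions made for illustration only.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (assumed here; other edit costs are possible)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def set_median(strings: Sequence[str]) -> str:
    """Return the string of the set with the smallest summed distance to the set.

    Needs a quadratic number of distance computations in the set size.
    """
    return min(strings, key=lambda c: sum(edit_distance(c, s) for s in strings))


if __name__ == "__main__":
    samples = ["abc", "abd", "abc", "xbc"]
    print(set_median(samples))
```

A prototype obtained this way can stand in for the whole class in a nearest-neighbour classifier: an unknown string is assigned to the class whose prototype is closest under the same distance.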
