Use of median string for classification

A string that minimizes the sum of distances to the strings of a given set is known as the (generalized) median string of the set. This concept is important in pattern recognition for modelling a (large) set of garbled strings or patterns. Finding such a string is an NP-hard problem, so no efficient exact algorithm is known. A greedy approach has previously been proposed to compute an approximate median string of a set of strings. This work proposes an algorithm that iteratively improves the approximate solution produced by that greedy approach. Experiments were carried out on synthetic and real data to compare the performance of the approximate median string with that of the conventional set median. These experiments showed that the proposed approximate median string represents a given set better than the corresponding set median.
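To make the difference between the set median and an approximate median string concrete, the following Python sketch may help. It is not the algorithm proposed in this work: it assumes the unit-cost Levenshtein distance, starts from the set median rather than from a greedy solution, and refines it with a simple steepest-descent search over single-symbol edits. All function names (`levenshtein`, `total_distance`, `set_median`, `neighbours`, `refine_median`) are hypothetical and chosen for illustration only.

```python
def levenshtein(a, b):
    """Unit-cost edit distance (an assumed choice of string distance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute or match
        prev = curr
    return prev[-1]


def total_distance(candidate, strings):
    """Sum of distances from candidate to every string in the set."""
    return sum(levenshtein(candidate, s) for s in strings)


def set_median(strings):
    """Member of the set with the smallest sum of distances to the set."""
    return min(strings, key=lambda s: total_distance(s, strings))


def neighbours(s, alphabet):
    """All strings one insertion, deletion or substitution away from s."""
    for i in range(len(s) + 1):
        for c in alphabet:
            yield s[:i] + c + s[i:]                  # insertion before position i
        if i < len(s):
            yield s[:i] + s[i + 1:]                  # deletion of position i
            for c in alphabet:
                if c != s[i]:
                    yield s[:i] + c + s[i + 1:]      # substitution at position i


def refine_median(strings):
    """Illustrative local search: move to the best single-edit neighbour
    until no neighbour lowers the sum of distances."""
    alphabet = sorted(set("".join(strings)))
    best = set_median(strings)
    best_cost = total_distance(best, strings)
    while True:
        cand = min(neighbours(best, alphabet),
                   key=lambda s: total_distance(s, strings),
                   default=best)
        cost = total_distance(cand, strings)
        if cost >= best_cost:          # local optimum reached
            return best, best_cost
        best, best_cost = cand, cost
```

For the set `["xbc", "ayc", "abz"]`, every member has a sum of distances of 4, whereas the refined string `"abc"`, which is not in the set, reaches 3. The search only guarantees a local optimum, which is consistent with the problem being NP-hard.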
