The Extended Boyer-Moore-Horspool Algorithm for Locality-Sensitive Pseudo-code

The Boyer-Moore-Horspool (BMH) algorithm is a highly efficient string-search algorithm that finds where a user-specified pattern appears within a longer text string. In this study, we propose the Extended Boyer-Moore-Horspool algorithm, which searches for a pattern in a sequence of real vectors rather than in a sequence of characters. We adapt the BMH algorithm to sequences of real vectors by transforming each vector into a pseudo-code representation that consists of multiple integers and by introducing a novel binary relation called ‘semiequivalent.’ We confirmed the practical utility of our algorithm by applying it to a string-matching problem on images from the “Minutes of the Imperial Diet,” to which optical character recognition