Prototype matching: finding meaning in the books of the Bible

Text documents are commonly characterised and classified by keywords that their authors assign to describe them. Visa et al. (1999; 2000) have, however, developed a new methodology based on prototype matching. The prototype is an interesting document or an extracted part of an interesting text. This prototype is matched against an existing document database or a monitored document flow. Our claim is that the new methodology is capable of automatically extracting meaning from the contents of a document. To test this hypothesis, an experiment was designed using the Bible. Two translations, one in English and one in Finnish, were selected as test material. Verification tests, which searched for the ten nearest books to every book of the Bible, were performed with a prototype version of the software application. The test results are reported in this paper. The new methodology is based on a hierarchy of self-organizing maps (SOM) and on a smart encoding of words. The words of a text document are first encoded and represented as word vectors. The word vectors are then clustered by the SOM, and this process creates a word map.
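The pipeline of encoding words, clustering the word vectors with a SOM, and matching a prototype against a collection can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the method of Visa et al.: the abstract does not specify the word encoding, so `encode_word` is a hypothetical stand-in; the map is a single one-dimensional SOM rather than a hierarchy; and comparing documents by histograms over the word map is one plausible way to rank "nearest" documents. All function names and parameters are hypothetical.

```python
import math
import random


def encode_word(word, dim=8):
    """Hypothetical encoding (not the paper's): fold character codes into a
    fixed-length vector and scale it to unit length."""
    vec = [0.0] * dim
    for i, ch in enumerate(word.lower()):
        vec[i % dim] += ord(ch) / 255.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def best_matching_unit(weights, vec):
    """Return the index of the map unit whose weight vector is closest to vec."""
    def sq_dist(w):
        return sum((a - b) ** 2 for a, b in zip(w, vec))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def train_som(vectors, n_units=6, epochs=30, seed=0):
    """Train a small one-dimensional SOM; the result plays the role of a word map."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs            # decays from 1 towards 0
        lr = 0.5 * frac                        # learning rate
        radius = max(1.0, (n_units / 2) * frac)  # neighbourhood width
        for vec in rng.sample(vectors, len(vectors)):
            bmu = best_matching_unit(weights, vec)
            for i, w in enumerate(weights):
                # Gaussian neighbourhood: units near the winner move the most.
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * h * (vec[d] - w[d])
    return weights


def word_histogram(text, weights):
    """Normalised count of how often each word-map unit wins for a text."""
    counts = [0] * len(weights)
    for word in text.split():
        counts[best_matching_unit(weights, encode_word(word))] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def nearest_documents(prototype, documents, weights, k=10):
    """Rank document names by Euclidean distance between histograms
    (an assumed similarity measure, chosen for simplicity)."""
    proto = word_histogram(prototype, weights)

    def dist(name):
        hist = word_histogram(documents[name], weights)
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(proto, hist)))

    return sorted(documents, key=dist)[:k]
```

In use, one would build the vocabulary from all documents, train the map once on the encoded words, and then call `nearest_documents` with the prototype text and a dictionary mapping book names to their text. With `k=10` this mirrors the "ten nearest books" search described above, though only in outline.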