Mathematical analysis of documentation systems: An attempt to a theory of classification and search request formulation

Abstract

As an attempt at a general structural theory of information retrieval, a documentation system (DS) is defined as a formal system consisting of (a) a set of objects (documents); (b) a set A++ of elementary attributes (keywords) from which further attributes may be constructed, so that A++ generates A; and (c) a set of axioms of the form X++(x) = M (M ∈ ℳ, where ℳ is a set of constants) connecting attributes with objects. From the axioms, further theorems (true statements) may be derived. By means of the theorems, different mappings A → B(a) are defined, where B(a) is the set of all subsets of the object set a; each mapping takes a search question to the set of documents retrieved. The type of a DS depends on two basic decisions: (1) the choice of rules for constructing attributes and theorems, e.g., the logical product in coordinate indexing, or links; and (2) the choice of ℳ, which may consist of the two constants “applicable” and “not applicable”, or of some positive integers, …. Further practical decisions are whether A++ is hierarchical, the kind of mapping, and the introduction of roles (further attributes). The simplest case, ordinary two-valued coordinate indexing, is discussed in detail: A is a free distributive (but not Boolean) lattice, and its homomorphic image is a ring of subsets of a; in place of negation, which is not useful, a useful retrieval operation called “praeternegation” is introduced. Also discussed are a generalized definition of superimposed coding, some functions for the distance between objects or attributes, and the optimization and automatic derivation of classifications. The model takes into account term-term relations and document-document relations. It may serve as a structural framework in terms of which the functional problems of retrieval theory may be expressed more clearly.
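The two-valued coordinate indexing case can be illustrated with a minimal Python sketch. It is not the paper's formalism: the toy index, the tuple encoding of search questions, and the function name `retrieve` are all assumptions made for illustration. It shows only the idea that attributes built from keywords by logical product ("and") and sum ("or") map homomorphically onto intersections and unions of document sets; praeternegation is not modelled, since the abstract does not define it.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical toy index: each keyword (elementary attribute) maps to the
# set of documents for which the axiom "keyword is applicable" holds.
INDEX: Dict[str, Set[str]] = {
    "lattice": {"d1", "d2"},
    "retrieval": {"d2", "d3"},
    "coding": {"d3"},
}

# A search question is a keyword, or a tuple ("and" | "or", left, right).
# This encoding is an assumption of the sketch, not taken from the paper.
Query = Union[str, Tuple[str, "Query", "Query"]]


def retrieve(query: Query, index: Dict[str, Set[str]]) -> Set[str]:
    """Map a search question to the set of documents it retrieves."""
    if isinstance(query, str):
        # Elementary attribute: look up its document set (empty if unknown).
        return set(index.get(query, set()))
    op, left, right = query
    left_docs = retrieve(left, index)
    right_docs = retrieve(right, index)
    if op == "and":
        return left_docs & right_docs  # logical product -> intersection
    if op == "or":
        return left_docs | right_docs  # logical sum -> union
    raise ValueError(f"unknown operator: {op!r}")


if __name__ == "__main__":
    print(sorted(retrieve(("and", "lattice", "retrieval"), INDEX)))
    print(sorted(retrieve(("or", "lattice", "coding"), INDEX)))
```

Because set intersection and union distribute over each other, the retrieved sets obey the same distributive laws as the attribute lattice, which is the property the abstract refers to when it describes the image as a ring of subsets.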
