Paper information - On the role of words and phrases in automatic text analysis

On the role of words and phrases in automatic text analysis

One of the most crucial operations in automatic information retrieval is the assignment to written texts and documents of appropriate identifiers capable of representing information content for search and retrieval purposes. This operation, known as automatic indexing, normally consists of assigning to the documents either single terms, or more specific entities such as phrases, or more general entities such as term classes. A model known as discrimination value analysis is introduced, which assigns an appropriate role in the indexing operation to the terms, term phrases, and thesaurus classes. The model is used to determine effectiveness criteria for the content identifiers and to generate useful indexing policies. Experimental evidence is given to validate the theory.
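The abstract names discrimination value analysis without giving the computation. The sketch below assumes the commonly described formulation: a term's discrimination value is the change in collection density (average cosine similarity between each document and the collection centroid) when that term is removed from every document. The function names, the dictionary representation of documents, and the toy collection are illustrative assumptions, not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def density(docs):
    """Average similarity of each document to the collection centroid."""
    centroid = {}
    for doc in docs:
        for term, weight in doc.items():
            centroid[term] = centroid.get(term, 0.0) + weight / len(docs)
    return sum(cosine(doc, centroid) for doc in docs) / len(docs)


def discrimination_values(docs):
    """Map each term to (density without the term) - (density with it).

    Under this assumed definition, a positive value means the term spreads
    documents apart (a useful discriminator); a negative value means the
    term pulls documents together (a poor discriminator).
    """
    base = density(docs)
    vocabulary = {term for doc in docs for term in doc}
    values = {}
    for k in vocabulary:
        reduced = [{t: w for t, w in doc.items() if t != k} for doc in docs]
        values[k] = density(reduced) - base
    return values


# Hypothetical toy collection: "common" occurs in every document,
# while "a", "b", and "c" each occur in exactly one.
docs = [
    {"common": 1.0, "a": 1.0},
    {"common": 1.0, "b": 1.0},
    {"common": 1.0, "c": 1.0},
]
dv = discrimination_values(docs)
for term in sorted(dv):
    print(term, round(dv[term], 4))
```

In this toy collection the term shared by all documents receives a negative value and the terms unique to one document receive positive values, which matches the intuition that very frequent terms are candidates for combination into phrases and very rare terms are candidates for grouping into thesaurus classes.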

Gerard Salton | A. Wong

[1] John Ashford. Co-operation in Library Automation; the COLA Project, 1975.

[2] George Kingsley Zipf, et al. Human behavior and the principle of least effort, 1949.

[3] Christine A. Montgomery, et al. Linguistics and information science, 1972, J. Am. Soc. Inf. Sci.

[4] Gerard Salton, et al. On the Specification of Term Values in Automatic Indexing, 1973.

[5] Gerard Salton, et al. A theory of indexing, 1975, Regional conference series in applied mathematics.

[6] Donald J. Hillman. Negotiation of inquiries in an on-line retrieval system, 1968, Inf. Storage Retr.

[7] Michael E. Lesk, et al. Relevance assessments and retrieval system evaluation, 1968, Inf. Storage Retr.

[8] H. P. Edmundson, et al. Automatic abstracting and indexing—survey and recommendations, 1961, CACM.

[9] Clement T. Yu, et al. A theory of term importance in automatic text analysis, 1974, J. Am. Soc. Inf. Sci.

[10] Clement T. Yu, et al. Precision Weighting—An Effective Automatic Indexing Method, 1976, J. ACM.

[11] Yehoshua Bar-Hillel, et al. Language and Information, 1964.

[12] Roger C. Schank, et al. Computer Models of Thought and Language, 1974.

[13] Gerard Salton, et al. Dynamic information and library processing, 1975.