An algorithm for suffix stripping
The automatic removal of suffixes from words in English is of particular interest in the field of information retrieval. An algorithm for suffix stripping is described, which has been implemented as a short, fast program in BCPL. Although simple, it performs slightly better than a much more elaborate system with which it has been compared. It effectively works by treating complex suffixes as compounds made up of simple suffixes, and removing the simple suffixes in a number of steps. In each step the removal of the suffix is made to depend upon the form of the remaining stem, which usually involves a measure of its syllable length.
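The stepwise idea in the abstract can be illustrated with a minimal Python sketch. It is not the BCPL program and not the published rule list. The `measure` function assumes the usual reading of "syllable length" as the number of vowel-to-consonant transitions in the stem. The two small rule tables, the thresholds, and all function names are hypothetical, chosen only to show a compound suffix being removed in successive steps.

```python
VOWELS = set("aeiou")


def is_consonant(word, i):
    """Return True if word[i] acts as a consonant (assumed convention)."""
    ch = word[i]
    if ch in VOWELS:
        return False
    if ch == "y":
        # Treat 'y' as a vowel when it follows a consonant, e.g. in "by".
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(stem):
    """Count vowel-to-consonant transitions, a rough syllable measure."""
    m = 0
    prev_vowel = False
    for i in range(len(stem)):
        cons = is_consonant(stem, i)
        if cons and prev_vowel:
            m += 1
        prev_vowel = not cons
    return m


# Hypothetical rule tables: (suffix, replacement) pairs, one list per step.
STEPS = [
    [("ization", "ize"), ("fulness", "ful"), ("ational", "ate")],
    [("ize", ""), ("ful", ""), ("ate", "")],
]
# A rule in step k fires only if measure(stem) exceeds THRESHOLDS[k].
THRESHOLDS = [0, 1]


def strip_suffixes(word):
    """Apply each step once; at most one rule per step is tried."""
    for rules, min_m in zip(STEPS, THRESHOLDS):
        for suffix, replacement in rules:
            if word.endswith(suffix):
                stem = word[: len(word) - len(suffix)]
                if measure(stem) > min_m:
                    word = stem + replacement
                break  # only the first matching suffix is considered
    return word


if __name__ == "__main__":
    for w in ["generalization", "relational", "rational", "hopefulness"]:
        print(w, "->", strip_suffixes(w))
```

In this sketch, "generalization" loses "-ization" in two steps (first to "generalize", then to "general"), while "rational" is left alone because the stem remaining after "-ational" would be too short. Making each removal conditional on the remaining stem is what keeps short words from being over-stripped.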