Paper information - PAT-tree-based keyword extraction for Chinese information retrieval

In this paper we raise the significance of keyword extraction using a new PAT-tree-based approach, which is efficient at automatically extracting keywords from a set of relevant Chinese documents. This approach has been successfully applied to several IR research tasks, such as document classification, book indexing, and relevance feedback. Many Chinese language processing applications can therefore advance from the character level to the word/phrase level.

Lee-Feng Chien | T.-I. Huang | M.-C. Chien
