Paper Information - Passage-Based Web Text Mining

Passage-Based Web Text Mining

The large amount of textual information on the Web is a very useful information resource. In the past, traditional text mining research treated a text document as a single piece of information. However, some Web documents are long and heterogeneous in content. This paper presents a new approach that applies the concept of a passage to Web text mining: a single Web text document is treated as several passages instead of a single text. The effectiveness of the approach is investigated using real Thai Web documents. As a preliminary step, we explore the influence of the passage-based method on the construction of association rules by comparing the rules it generates with those generated by the non-passage-based method.
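The comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how passages are delimited or which rule-mining algorithm is used, so the fixed-size word window in `split_into_passages`, the one-to-one rule miner `mine_rules`, the thresholds, and the toy English documents are all assumptions made for illustration. Real Thai text would also need a word segmenter, since it is not whitespace-delimited.

```python
from collections import Counter
from itertools import combinations


def split_into_passages(text, size=4):
    """Split text into consecutive windows of `size` words.

    Fixed-size windows are an assumed passage definition, chosen
    only to keep the sketch short.
    """
    words = text.lower().split()
    return [words[i:i + size] for i in range(0, len(words), size)]


def mine_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return {(lhs, rhs): (support, confidence)} for rules lhs -> rhs.

    Only single-term rules are mined; each transaction is a list of terms.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for terms in transactions:
        items = sorted(set(terms))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Check both directions of the pair as candidate rules.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules[(lhs, rhs)] = (support, confidence)
    return rules


# Toy documents (hypothetical): each mixes two unrelated topics.
docs = [
    "rice export price rises football match tonight ends",
    "rice export grows while football league starts soon",
]

# Non-passage-based: one transaction per whole document.
doc_transactions = [d.lower().split() for d in docs]
# Passage-based: one transaction per passage.
passage_transactions = [p for d in docs for p in split_into_passages(d, size=4)]

doc_rules = mine_rules(doc_transactions)
passage_rules = mine_rules(passage_transactions)
```

In this toy data, the document-level transactions yield a rule linking `rice` and `football` simply because both topics share a document, while the passage-level transactions keep only the `rice` and `export` rules, since those terms co-occur within the same passage.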

Thanaruk Theeramunkong

[1] Huang Yuan, et al. Web mining: knowledge discovery on the Web, 1999, IEEE SMC'99 Conference Proceedings. 1999 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No.99CH37028).

[2] Yonatan Aumann, et al. Text Mining via Information Extraction, 1999, PKDD.