An approach to automatic classification of text for information retrieval
In this paper, we explore an approach to making better use of semi-structured documents in information retrieval in the domain of biology. Using machine learning techniques, we make the inherent structure of these documents explicit through XML markup. This markup has great potential to improve task performance in specimen identification and the usability of online floras and faunas.
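To illustrate the general idea of learning a classifier that turns implicit document structure into XML markup, here is a minimal sketch. It is not the method of the paper, which the abstract does not specify: a small naive Bayes classifier is swapped in as a stand-in, and the training clauses, the element names (`leaf`, `flower`, `fruit`), the semicolon-based clause splitting, and the function names are all hypothetical choices made for this example.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical toy training data: (clause, element label). Not from the paper.
TRAIN = [
    ("leaves ovate, 3-5 cm long, margins serrate", "leaf"),
    ("leaves alternate, petiole short", "leaf"),
    ("flowers white, petals 5", "flower"),
    ("flowers solitary, sepals green", "flower"),
    ("fruit a capsule, seeds many", "fruit"),
    ("fruit a red berry", "fruit"),
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Collect per-label word counts for a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the most probable label, using add-one smoothing."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in tokenize(text):
            score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def mark_up(description, model):
    """Split a description on ';' and wrap each clause in its predicted element."""
    root = ET.Element("description")
    for clause in description.split(";"):
        clause = clause.strip()
        if clause:
            ET.SubElement(root, classify(clause, model)).text = clause
    return ET.tostring(root, encoding="unicode")


model = train(TRAIN)
print(mark_up("Leaves ovate, margins entire; flowers white, petals 4; fruit a berry", model))
```

Once each clause carries an element name, a retrieval system can match a query about, say, leaf shape against `leaf` elements only instead of the whole description. A realistic system would need far more training data and a more careful segmentation step than splitting on semicolons.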