Searching Tourism Information by Using Vertical Search Engine Based on Nutch and Solr
Traditional general-purpose search engines suffer from several problems in information retrieval, including an overwhelming number of results, poor domain specificity, and low precision. In this paper, we propose a vertical search engine based on Nutch and Solr. We use a dictionary-based forward-iteration finest-granularity segmentation algorithm for Chinese word segmentation, and a keyword-based Vector Space Model (VSM) to judge topic relevance. We also extend the user search module and the tourism domain word library to support the information collection, retrieval filtering, and related-word stages. Experiments conducted to evaluate the approach show that the vertical search engine based on Nutch and Solr, applied to tourism information retrieval, improves retrieval precision and meets users' domain-specific retrieval needs.
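The two techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary entries, the topic keyword weights, the `max_len` limit, and the function names `segment_finest` and `cosine_relevance` are all assumptions made for illustration, and the paper's actual word library and weighting scheme are not given here. The segmenter scans forward and emits every dictionary word that starts at each position (one plausible reading of "finest granularity"), and relevance is the cosine between the page's term-frequency vector and a topic keyword vector.

```python
import math
from collections import Counter

# Hypothetical tourism word library; the paper's real dictionary is not available.
TOURISM_DICT = {"北京", "故宫", "北京故宫", "门票", "酒店", "旅游", "景点"}

# Hypothetical topic keyword weights for the VSM topic vector.
TOPIC_WEIGHTS = {"故宫": 1.0, "门票": 1.0}


def segment_finest(text, dictionary, max_len=4):
    """Scan forward and emit every dictionary word starting at each position.

    Overlapping words are all kept, so "北京故宫" yields "北京",
    "北京故宫" and "故宫" rather than only the longest match.
    """
    tokens = []
    for start in range(len(text)):
        longest_end = min(len(text), start + max_len)
        for end in range(start + 1, longest_end + 1):
            piece = text[start:end]
            if piece in dictionary:
                tokens.append(piece)
    return tokens


def cosine_relevance(tokens, topic_weights):
    """Cosine similarity between a term-frequency vector and a topic vector."""
    tf = Counter(tokens)
    dot = sum(tf[term] * weight for term, weight in topic_weights.items())
    norm_doc = math.sqrt(sum(count * count for count in tf.values()))
    norm_topic = math.sqrt(sum(w * w for w in topic_weights.values()))
    if norm_doc == 0 or norm_topic == 0:
        return 0.0  # no dictionary words found, or an empty topic vector
    return dot / (norm_doc * norm_topic)


if __name__ == "__main__":
    tokens = segment_finest("北京故宫门票", TOURISM_DICT)
    print(tokens)
    print(round(cosine_relevance(tokens, TOPIC_WEIGHTS), 4))
```

In a crawler built along these lines, a page would be kept for indexing only when its relevance score exceeds a chosen threshold; the threshold value is a tuning decision that the abstract does not specify.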