Knowledge retrieval based on text clustering and distributed Lucene

To address the poor performance and efficiency of a traditional centralized index when processing large-scale unstructured knowledge, the authors propose a retrieval algorithm based on text clustering. The algorithm uses text clustering to improve the existing index distribution method, and it narrows the search range by inferring query intent from the distance between the query and the clusters. Experimental results show that the proposed scheme effectively relieves the indexing and retrieval pressure of handling large-scale data. It substantially improves the performance of distributed retrieval while maintaining relatively high precision and recall.
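
The idea of narrowing the search range by query-to-cluster distance can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words vectors, the use of cosine similarity as the distance measure, the `top_k` parameter, and the shard names are all assumptions made for illustration. Each cluster is assumed to correspond to one index shard, and a query is sent only to the shards whose centroids are closest to it.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector (a stand-in for real text analysis)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Mean term-weight vector of a cluster."""
    total = Counter()
    for v in vectors:
        total.update(v)  # Counter.update adds counts
    return {t: w / len(vectors) for t, w in total.items()}


def route_query(query, centroids, top_k=1):
    """Return the top_k shard names whose centroids are most similar to the query."""
    q = vectorize(query)
    ranked = sorted(centroids, key=lambda s: cosine(q, centroids[s]), reverse=True)
    return ranked[:top_k]


# Hypothetical clusters: each one is assumed to back a separate index shard.
clusters = {
    "shard_db": ["database index query optimization", "sql database transaction log"],
    "shard_net": ["network packet routing protocol", "tcp network congestion control"],
}
centroids = {
    shard: centroid([vectorize(doc) for doc in docs])
    for shard, docs in clusters.items()
}

# Only the selected shards would then be searched.
selected = route_query("database query tuning", centroids, top_k=1)
```

In this sketch a larger `top_k` searches more shards and trades speed for recall, which matches the trade-off between retrieval performance and accuracy that the abstract describes.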