Analysis on Chinese Segmentation Algorithm of Lucene.net
In Lucene.net, Chinese word segmentation is performed by the Analyzer class. An analysis of the five built-in analyzers of Lucene.net (KeywordAnalyzer, StandardAnalyzer, StopAnalyzer, SimpleAnalyzer, and WhitespaceAnalyzer) found that their segmentation is based on single characters. To process Chinese text better, an imported segmentation package was added. Tests of three typical packages (ChineseAnalyzer, CJKAnalyzer, and IKAnalyzer) found that IKAnalyzer, which uses dictionary-based segmentation with a bidirectional (forward and backward) search method, worked well.
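
The idea of dictionary-based segmentation with forward and backward search can be illustrated with a minimal sketch of bidirectional maximum matching. This is not IKAnalyzer's actual implementation. The function names, the toy dictionary, and the tie-breaking rule (prefer fewer tokens, then fewer single-character tokens) are assumptions chosen for illustration only.

```python
def forward_max_match(text, dictionary, max_len):
    """Scan left to right, taking the longest dictionary word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when no dictionary word matches.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, dictionary, max_len):
    """Scan right to left, taking the longest dictionary word ending at each position."""
    tokens = []
    j = len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # tokens were collected from the end of the text
    return tokens


def bidirectional_match(text, dictionary):
    """Run both scans and keep the result that looks more plausible.

    Assumed heuristic: fewer tokens wins; on a tie, fewer
    single-character tokens wins; otherwise the backward result is kept.
    """
    max_len = max((len(word) for word in dictionary), default=1)
    fwd = forward_max_match(text, dictionary, max_len)
    bwd = backward_max_match(text, dictionary, max_len)
    if len(fwd) != len(bwd):
        return fwd if len(fwd) < len(bwd) else bwd
    singles_fwd = sum(1 for t in fwd if len(t) == 1)
    singles_bwd = sum(1 for t in bwd if len(t) == 1)
    return fwd if singles_fwd < singles_bwd else bwd


if __name__ == "__main__":
    toy_dictionary = {"研究", "研究生", "生命", "起源"}  # hypothetical word list
    print(bidirectional_match("研究生命起源", toy_dictionary))
```

With this toy dictionary, the forward scan yields 研究生 / 命 / 起源 and the backward scan yields 研究 / 生命 / 起源. Both have three tokens, but the backward result contains no single-character token, so it is the one returned. A purely character-based analyzer would instead emit six separate one-character tokens for the same input.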