An improved Bayesian text categorization system
The weighting factor applied to the conditional probability in Naive Bayes is improved. The new factor is the product of a word's category difference and its frequency. It emphasizes words with a high category difference, reflects the positive contribution of frequency, and reduces the influence of common words. On a corpus of 30,000 documents in 15 categories and 244 sub-categories, experiments confirmed the method: MicroF1 increased by 18.9 percent at the parent-category level and by 7.6 percent at the sub-category level.
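The weighting idea can be illustrated with a minimal Python sketch. The abstract does not give the formula for the category difference, so the measure used here (the normalized spread of the smoothed P(w|c) across categories) is an assumption made for illustration only, as are the function names `train` and `classify` and the add-one smoothing. Each word's log-probability is scaled by the product of its category difference and its frequency in the document being classified.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs: list of (tokens, label) pairs.

    Returns log priors, add-one smoothed P(w|c), and a per-word
    category-difference weight (an assumed measure, not the paper's).
    """
    class_docs = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}

    total_docs = sum(class_docs.values())
    log_prior = {c: math.log(n / total_docs) for c, n in class_docs.items()}

    cond = {}
    for c in class_docs:
        denom = sum(word_counts[c].values()) + len(vocab)
        cond[c] = {w: (word_counts[c][w] + 1) / denom for w in vocab}

    # Assumed category difference: (max - min) / (max + min) of P(w|c)
    # over categories. It is 0 for words spread evenly across categories
    # and approaches 1 for words concentrated in one category.
    diff = {}
    for w in vocab:
        probs = [cond[c][w] for c in class_docs]
        diff[w] = (max(probs) - min(probs)) / (max(probs) + min(probs))
    return log_prior, cond, diff


def classify(tokens, log_prior, cond, diff):
    """Pick the category maximizing the weighted log-score.

    Weight of each word = category difference * frequency in the document.
    Words unseen in training are ignored.
    """
    tf = Counter(t for t in tokens if t in diff)
    scores = {
        c: log_prior[c]
        + sum(diff[w] * n * math.log(cond[c][w]) for w, n in tf.items())
        for c in log_prior
    }
    return max(scores, key=scores.get)
```

Under this assumed measure, a word that is equally likely in every category receives weight zero and no longer affects the decision, while a frequent, category-specific word contributes in proportion to both its frequency and its category difference.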