Paper information - Research on Network Content Audit Based on Information Fingerprint

This paper presents a hybrid strategy for network content filtering, based on the specific features of the advertising robots that are widespread in network information. The strategy first attempts a quick decision by computing a fingerprint of the network content, and falls back to Bayesian filtering when the fingerprint alone cannot decide. Experimental results show that the strategy outperforms the Bayesian method used alone, both in system running efficiency and in detecting advertising robots.
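The two-stage decision described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how fingerprints are computed or matched, so the SHA-256 hash of normalized text, the repetition threshold, and all names (`fingerprint`, `NaiveBayes`, `HybridFilter`, `repeat_threshold`) are assumptions made for the example.

```python
import hashlib
import math
import re
from collections import Counter


def fingerprint(text: str) -> str:
    """Hash of case- and whitespace-normalized text (assumed scheme)."""
    normalized = re.sub(r"\s+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class NaiveBayes:
    """Tiny multinomial naive Bayes with Laplace smoothing.

    Assumes at least one training document per label before is_ad() is called.
    """

    def __init__(self):
        self.word_counts = {"ad": Counter(), "ok": Counter()}
        self.doc_counts = Counter()

    def train(self, text: str, label: str) -> None:
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def is_ad(self, text: str) -> bool:
        vocab = set(self.word_counts["ad"]) | set(self.word_counts["ok"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in ("ad", "ok"):
            total_words = sum(self.word_counts[label].values())
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
            scores[label] = score
        return scores["ad"] > scores["ok"]


class HybridFilter:
    """Fingerprint check first; Bayesian filtering only when undecided."""

    def __init__(self, classifier: NaiveBayes, repeat_threshold: int = 3):
        self.classifier = classifier
        self.repeat_threshold = repeat_threshold  # hypothetical parameter
        self.seen = Counter()

    def check(self, text: str):
        fp = fingerprint(text)
        self.seen[fp] += 1
        # Fast path: the same content posted repeatedly looks robot-like.
        if self.seen[fp] >= self.repeat_threshold:
            return True, "fingerprint"
        # Slow path: the fingerprint cannot decide, so ask the classifier.
        return self.classifier.is_ad(text), "bayes"


nb = NaiveBayes()
nb.train("buy cheap watches now", "ad")
nb.train("cheap pills buy now", "ad")
nb.train("meeting notes for the project", "ok")
nb.train("see you at the project meeting", "ok")
content_filter = HybridFilter(nb)
```

In this sketch the fingerprint lookup is a single hash and dictionary access, so repeated postings are flagged without running the classifier at all, which is one plausible reading of the efficiency gain the abstract reports.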

Hua Jiang | Wanzhen Zhang | Tonglai Liu
