As a carrier of information, the internet inevitably contains a large amount of bad information. To identify such information, we carry out an experimental study that builds on existing Chinese text mining technology. Inspired by the dual decision system of AlphaGo, the experiment handles text identification and classification with two cooperating system models in order to obtain more accurate results. The first system performs text segmentation and feature selection. The second system applies a method based on rules and statistics, comparing the text against an established bad information database to determine whether it is bad information. The two systems then work together to identify and classify the text. Following AlphaGo's strategy, previously separate methods are combined into a single integrated system. This improves execution efficiency without reducing the recall rate or the identification and classification accuracy.
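The two-system flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the vocabulary, stop words, weighted bad-term database, and threshold are all hypothetical, and forward maximum matching stands in for whatever Chinese segmenter the experiment actually used. The first part segments the text and selects features; the second part scores those features against the bad-term database.

```python
from collections import Counter

# Hypothetical data for illustration only; none of it comes from the paper.
VOCAB = {"免费", "中奖", "点击", "链接", "领取", "今天", "天气", "很好"}
STOP_WORDS = {"的", "了", "，", "。"}
BAD_TERMS = {"中奖": 0.6, "免费": 0.3, "领取": 0.3}  # term -> assumed weight
MAX_WORD_LEN = max(len(w) for w in VOCAB)


def segment(text):
    """Forward maximum matching: take the longest vocabulary word at each position."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when no vocabulary word matches.
            if size == 1 or piece in VOCAB:
                tokens.append(piece)
                i += size
                break
    return tokens


def select_features(tokens):
    """System one: keep non-stop-word tokens as term counts."""
    return Counter(t for t in tokens if t not in STOP_WORDS)


def score(features):
    """System two: weighted share of tokens that match the bad-term database."""
    total = sum(features.values())
    if total == 0:
        return 0.0
    hit = sum(BAD_TERMS.get(term, 0.0) * count for term, count in features.items())
    return hit / total


def classify(text, threshold=0.2):
    """Run both systems in sequence and label the text 'bad' or 'normal'."""
    features = select_features(segment(text))
    return "bad" if score(features) >= threshold else "normal"
```

With these assumed weights, `classify("免费中奖，点击链接领取")` returns `"bad"` and `classify("今天天气很好")` returns `"normal"`. A real system would replace the toy vocabulary with a proper segmenter and learn the weights and threshold from labelled data.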