With the rapid spread of the Internet and of multimedia as a new mode of information transmission, people can not only obtain the information they want easily but also publish their own information to the world. At the same time, the introduction of tablet PCs, smartphones, and other network terminals, together with the emergence of various social networks, has greatly accelerated the growth of information on the Internet. People update text, pictures, video, and other data in a variety of applications every day. Data show that the volume of information on the Internet is growing exponentially, and a news or media company will typically see hundreds and thousands of submissions every day; people now live in an era of information explosion. Given such huge information resources, how to manage them effectively, so that people can find target information more conveniently and quickly, has become a hot research topic. Text classification, a technique in text information mining, is an effective way to address this problem. We study mobile text classification based on the maximum entropy model and implement an automatic text classification system in a cloud computing environment; through technical improvements, we provide a technical solution for handling the large number of documents on the network in a mobile environment. This paper introduces text classification methods, the features of the maximum entropy model combined with an improved information gain feature selection method, the preprocessing method, and the MapReduce programming method. The experimental results show good accuracy and recall on the classification of large amounts of text, meeting the requirements of practical application.
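As a rough illustration of the feature selection step mentioned above, the following minimal sketch scores terms by standard information gain and keeps the top k. It is not the improved variant used in this work, whose details are not given here; the function names (`entropy`, `information_gain`, `select_features`) and the representation of each document as a set of tokens are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Standard information gain of `term` with respect to the class labels.

    docs   -- list of token sets, one per document (assumed representation)
    labels -- list of class labels, aligned with docs
    """
    n = len(docs)
    # Class counts among documents that contain / do not contain the term.
    with_term = Counter(l for d, l in zip(docs, labels) if term in d)
    without_term = Counter(l for d, l in zip(docs, labels) if term not in d)
    n_with = sum(with_term.values())
    n_without = n - n_with

    h_class = entropy(list(Counter(labels).values()))
    h_conditional = (
        (n_with / n) * entropy(list(with_term.values()))
        + (n_without / n) * entropy(list(without_term.values()))
    )
    return h_class - h_conditional


def select_features(docs, labels, k):
    """Return the k terms with the highest information gain (ties broken by name)."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


if __name__ == "__main__":
    # Toy corpus, invented for illustration only.
    docs = [
        {"goal", "match"},
        {"goal", "team"},
        {"stock", "market"},
        {"stock", "team"},
    ]
    labels = ["sport", "sport", "finance", "finance"]
    print(select_features(docs, labels, 2))
```

In a MapReduce setting, the per-term class counts would typically be accumulated in the map and reduce phases before scoring; the single-machine version here only shows the scoring itself.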