Genetic optimization of big data sentiment analysis

This paper addresses opinion mining from unstructured text documents. The proposed method requires minimal prior knowledge of the analysed language, so it can be deployed to any language. It combines a Support Vector Machine classifier, Big Data analysis, and genetic algorithm (GA) optimization. To make the optimization feasible alongside the big data approach, we propose GA operators that significantly accelerate convergence to accurate solutions. The method outperformed traditional approaches to text valence classification, which rely on language-dependent text preprocessing, with a highest achieved accuracy of 90.09 %. Validation was performed on a data set of Czech texts.
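
As a rough illustration of the optimization loop described above, the following minimal sketch evolves a bit mask with a generic genetic algorithm. It is not the method proposed in the paper: the abstract does not specify the proposed GA operators, so the sketch assumes standard tournament selection, uniform crossover, bit-flip mutation, and elitism. The names `evolve`, `toy_fitness`, and `TARGET` are hypothetical, and the fitness function is a stand-in. In a setup like the one described here, the fitness would presumably be the validation accuracy of the classifier trained with the configuration that the mask encodes.

```python
import random


def evolve(fitness, n_genes, pop_size=20, generations=30,
           mutation_rate=0.05, seed=0):
    """Maximise `fitness` over bit masks of length `n_genes`.

    Standard operators only (an assumption, not the paper's operators).
    Returns the best mask found and the best fitness per generation.
    """
    rng = random.Random(seed)  # seeded for reproducible runs
    pop = [[rng.randint(0, 1) for _ in range(n_genes)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    history = [fitness(best)]

    def pick(population):
        # Tournament selection: the fitter of two random individuals.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = pick(pop), pick(pop)
            # Uniform crossover: each gene comes from either parent.
            child = [g1 if rng.random() < 0.5 else g2
                     for g1, g2 in zip(p1, p2)]
            # Bit-flip mutation with a small per-gene probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
        history.append(fitness(best))
    return best, history


# Stand-in fitness (hypothetical): fraction of genes matching a hidden
# target mask. A real run would evaluate a trained classifier instead.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]


def toy_fitness(mask):
    return sum(int(m == t) for m, t in zip(mask, TARGET)) / len(TARGET)


best_mask, history = evolve(toy_fitness, n_genes=len(TARGET))
print(best_mask, history[-1])
```

Because of elitism, the best fitness never decreases from one generation to the next. Each fitness call in a real system would mean training and validating a classifier, which is why operators that reach accurate solutions in fewer generations matter when the data set is large.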
