Sentiment classification for employees reviews using regression vector-stochastic gradient descent classifier (RV-SGDC)

Employee satisfaction is essential for any organization to sustain productivity and achieve its goals. Organizations try to keep their employees satisfied by shaping policies around employees’ demands, which helps create a good working environment. It is therefore beneficial for organizations to conduct and analyze staff satisfaction surveys to gauge the level of satisfaction among employees. Sentiment analysis can assist in this regard, as it categorizes the sentiment of reviews as positive or negative. In this study, we perform experiments on employee reviews from six of the world’s biggest companies and classify the reviews by sentiment. For this, we propose an approach that combines lexicon-based and machine learning-based techniques. First, we extract the sentiment of each text review and label the dataset as positive or negative using TextBlob. We then propose a hybrid voting model named Regression Vector-Stochastic Gradient Descent Classifier (RV-SGDC) for sentiment classification. RV-SGDC combines logistic regression, support vector machines, and stochastic gradient descent under a majority voting criterion. We also compare RV-SGDC against other machine learning models. Three feature extraction techniques, term frequency-inverse document frequency (TF-IDF), bag of words, and global vectors (GloVe), are used to train the learning models. We evaluate all models in terms of accuracy, precision, recall, and F1 score. The results show that RV-SGDC outperforms the other models, achieving an accuracy of 0.97 with TF-IDF features, which we attribute to its hybrid architecture.
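The voting scheme described above can be illustrated with a minimal sketch. It assumes scikit-learn, uses a linear-kernel SVM, and trains on a tiny hand-labeled toy corpus that stands in for the TextBlob-labeled reviews. The hyperparameters, the kernel choice, and the names `reviews`, `labels`, and `model` are illustrative assumptions, not the configuration reported in the study.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy corpus (hypothetical). In the study, labels come from TextBlob polarity;
# here they are written by hand so the sketch has no extra dependencies.
reviews = [
    "great culture and supportive managers",
    "excellent benefits and friendly team",
    "good work life balance and fair pay",
    "amazing learning opportunities",
    "terrible management and long hours",
    "poor pay and no career growth",
    "stressful environment and bad leadership",
    "awful communication and unfair workload",
]
labels = ["positive"] * 4 + ["negative"] * 4

# Hard voting: each base model casts one vote and the majority label wins.
# Default hyperparameters are an assumption, not the paper's settings.
voter = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression()),
        ("svm", SVC(kernel="linear")),
        ("sgd", SGDClassifier(random_state=0)),
    ],
    voting="hard",
)

# TF-IDF features feed all three base models through one pipeline.
model = Pipeline([("tfidf", TfidfVectorizer()), ("vote", voter)])
model.fit(reviews, labels)

predictions = model.predict(
    ["supportive team and great pay", "bad management and poor growth"]
)
print(list(predictions))
```

With three base models, hard voting can never tie on a binary task, which makes a three-way combination a natural fit for positive/negative classification.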
