ECharacterize: A Novel Feature Selection-Based Framework for Characterizing Entrepreneurial Influencers in Arabic Twitter

Abstract— Social media are widely used as communication platforms in the business world. Twitter, in particular, offers valuable opportunities for collaboration because of its open nature. Many entrepreneurs therefore use Twitter for purposes such as mobilizing financial resources, obtaining funding, and increasing their innovation capabilities, and they keep looking for local entrepreneurial accounts that can help them. Messages from entrepreneurial influencers (opinion leaders) increase the diffusion of information to entrepreneurs and help them find more opportunities. Discovering the characteristics of entrepreneurial influencers in Twitter networks is therefore important, since these characteristics indicate how entrepreneurs can be reached. In this paper, we propose a novel framework called ECharacterize, based on feature selection techniques, to robustly discover the characteristics of entrepreneurial influencers in the Saudi context. The framework extracts a large set of influencer features and then employs seven state-of-the-art ranking methods to determine the most relevant influencer characteristics. It then combines the resulting ranked lists into a single final list using Robust Rank Aggregation. The framework was evaluated on 233,018 real-life Arabic tweets. The results show that the proposed method can distinguish between influencers by their popularity, reliability, and activity level.
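To make the aggregation step concrete, the following is a minimal sketch of how several ranked feature lists could be combined with a rho score of the kind used by Robust Rank Aggregation. It is not the authors' implementation: the feature names, the three example rankings, and the helper names (`binomial_tail`, `rho_scores`, `aggregate`) are hypothetical, and the sketch assumes every list ranks the same full set of features, best first.

```python
from math import comb


def binomial_tail(k, m, x):
    """P(at least k of m independent Uniform(0,1) draws are <= x).

    For integer parameters this equals the Beta(k, m - k + 1) CDF at x,
    i.e. the p-value of the k-th smallest normalised rank under the
    null hypothesis that ranks are random.
    """
    return sum(comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(k, m + 1))


def rho_scores(ranked_lists):
    """Return one rho score per feature (smaller = more consistently top-ranked)."""
    items = ranked_lists[0]
    n, m = len(items), len(ranked_lists)
    scores = {}
    for item in items:
        # Normalised rank of the feature in each list, sorted ascending.
        norm = sorted((lst.index(item) + 1) / n for lst in ranked_lists)
        # Smallest order-statistic p-value, Bonferroni-corrected by the
        # number of lists and capped at 1.
        p_min = min(binomial_tail(k, m, r) for k, r in enumerate(norm, start=1))
        scores[item] = min(1.0, p_min * m)
    return scores


def aggregate(ranked_lists):
    """Order features by ascending rho score (ties keep first-list order)."""
    scores = rho_scores(ranked_lists)
    return sorted(scores, key=scores.get)


# Hypothetical output of three feature rankers (illustrative names only).
rankings = [
    ["followers", "retweets", "tweets_per_day", "account_age"],
    ["followers", "tweets_per_day", "retweets", "account_age"],
    ["retweets", "followers", "tweets_per_day", "account_age"],
]
final_list = aggregate(rankings)
```

In this toy input, a feature that sits near the top of most lists receives a small rho score and moves to the front of the final list, while features ranked inconsistently or low are pushed toward a score of 1. With seven rankers, as in the framework described above, the same functions would simply receive seven lists.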
