Social Credibility Incorporating Semantic Analysis and Machine Learning: A Survey of the State-of-the-Art and Future Research Directions

The wealth of Social Big Data (SBD) represents a unique opportunity for organisations to exploit this abundance of data to increase their revenues. Hence, there is an imperative need to capture, load, store, process, analyse, transform, interpret, and visualise such manifold social datasets to develop meaningful insights that are specific to an application’s domain. This paper lays the theoretical background by reviewing the state-of-the-art literature on the research topic. The review is accompanied by a critical evaluation of current approaches and by recommendations for bridging the research gap.
