An Approach for Time-aware Domain-based Social Influence Prediction

Online Social Networks (OSNs) have established virtual platforms that enable people to express their opinions, interests and thoughts in a variety of contexts and domains. Given the open settings of OSNs and the few restrictions they impose, the medium allows legitimate and genuine users as well as spammers and other untrustworthy users to publish and spread their content. Hence, the concept of social trust has attracted the attention of information processors/data scientists and information consumers/business firms. One of the main reasons for acquiring the value of Social Big Data (SBD) is to provide frameworks and methodologies with which the credibility of OSN users can be evaluated. These approaches should be scalable to accommodate large-scale social data. There is therefore a need for a thorough understanding of social trust to improve and expand the analysis process and to infer the credibility of SBD. This paper presents an approach that incorporates semantic analysis and machine learning modules to measure and predict users' trustworthiness in numerous domains over different time periods. The evaluation of the conducted experiment validates the applicability of the incorporated machine learning techniques to predicting highly trustworthy domain-based users.
