User behavior analysis-based smart energy management for webpage ranking: Learning automata-based solution

Abstract: Search engines are widely used for surfing the Internet. They differ in accuracy and in the time they take to fetch information from distributed or centralized database repositories across the globe. The literature reports that webpage ranking reduces users' surfing time, which in turn saves considerable energy during computation and transmission across the network. Most earlier solutions use the hyperlink structure of the web graph, which consumes a lot of energy during computation and may lead to the link leakage problem, with spam pages occurring more often. Hyperlink structure alone is now inadequate for predicting webpage importance, given the energy consumption of various smart devices. User browsing behavior reflects a page's real importance, and demoting spam pages is essential to increase search engine accuracy and speed. Hence, user behavior analysis combined with demotion of spam pages can improve Search Engine Result Pages (SERP), which in turn saves energy. In the proposed approach, the webpage importance score is computed by analyzing two user surfing behavior attributes: dwell time and click count. After the importance score is computed, the ranks are revised in a Learning Automata (LA) environment. A learning automaton is a stochastic system that learns from its environment, which responds to each action with either a reward or a penalty. With every response from the environment, the probability of visiting the webpage is updated. Probabilities are computed using Normal and Gamma distribution functions. The experiments consider only dangling pages. Inactive webpages are penalized and degraded. The proposed approach is validated on the Microsoft Learning to Rank dataset.
In the experiments, 3403 of 12211 dangling pages were degraded by the proposed scheme. The scheme thereby meets its objective of saving web energy and computational cost. It converges in 100 iterations, which execute in 21.88 ms. In addition, the user behavior analysis helped improve the PageRank scores of the webpages.
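
The reward and penalty updates described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so this example assumes the textbook linear reward-penalty scheme rather than the Normal and Gamma based computation the authors report. The function name `update_probabilities` and the learning rates `a` and `b` are hypothetical, chosen only for illustration.

```python
from typing import List


def update_probabilities(probs: List[float], chosen: int, rewarded: bool,
                         a: float = 0.1, b: float = 0.05) -> List[float]:
    """One step of a linear reward-penalty learning automaton.

    Assumption: this is the textbook scheme, not the paper's own
    Normal/Gamma based probability computation.
    probs   -- current visit probabilities, one per webpage (sums to 1)
    chosen  -- index of the webpage selected in this step
    rewarded -- True if the environment rewarded the choice
    a, b    -- hypothetical reward and penalty learning rates
    """
    r = len(probs)
    if r < 2:
        raise ValueError("need at least two actions")
    updated = []
    for j, p in enumerate(probs):
        if rewarded:
            # Reward: shift probability mass toward the chosen page.
            new_p = p + a * (1.0 - p) if j == chosen else (1.0 - a) * p
        else:
            # Penalty: shift mass away from the chosen page, spread evenly.
            new_p = (1.0 - b) * p if j == chosen else b / (r - 1) + (1.0 - b) * p
        updated.append(new_p)
    return updated


# Four pages start with equal visit probability; page 0 is rewarded once.
probs = update_probabilities([0.25, 0.25, 0.25, 0.25], chosen=0, rewarded=True)
```

Both branches keep the probabilities summing to one, so the vector stays a valid distribution across iterations. Under this assumed scheme, a page that is penalized repeatedly sees its visit probability shrink, which loosely mirrors how inactive pages are degraded.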
