Detection of dishonest behaviors in on-line networks using graph-based ranking techniques

Dishonest behavior in on-line networks covers the actions that certain elements of a network perform in order to obtain some kind of benefit from the system. This phenomenon affects the WWW from two points of view: the Web as a collection of interrelated documents, and social networks. In this work we study web spam detection and the computation of trust and reputation in on-line social networks. We propose two graph-based ranking algorithms, each based on a different propagation model, that spread information from a set of elements in the network to compute the global relevance of every node in the system.
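The abstract does not spell out either propagation model, so the following is only a generic illustration of the idea of spreading a score from a seed set over a graph. It is a seed-biased, PageRank-style power iteration, not the algorithms proposed in this work. The function name `seeded_rank`, its parameters, the damping value 0.85, and the toy graph are all assumptions made for the example.

```python
def seeded_rank(edges, seeds, damping=0.85, iterations=50):
    """Seed-biased PageRank-style propagation (illustrative sketch only).

    edges: iterable of (source, target) pairs of a directed graph.
    seeds: nodes from which the score is propagated.
    Returns a dict mapping every node to its score; scores sum to 1.
    """
    seeds = set(seeds)
    if not seeds:
        raise ValueError("at least one seed node is required")

    nodes = sorted({n for edge in edges for n in edge} | seeds)
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Teleport vector: uniform over the seed set, zero elsewhere.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(teleport)

    for _ in range(iterations):
        # Every node first receives its share of the teleport mass.
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                # Split the damped score evenly among the out-links.
                share = damping * score[src] / len(targets)
                for dst in targets:
                    nxt[dst] += share
            else:
                # Dangling node: hand its damped score back to the seeds.
                for n in nodes:
                    nxt[n] += damping * score[src] * teleport[n]
        score = nxt
    return score


# Toy graph (hypothetical): a 3-cycle reachable from the seed "a",
# plus a separate 2-cycle that the seed cannot reach.
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("x", "y"), ("y", "x")]
toy_scores = seeded_rank(toy_edges, seeds={"a"})
```

In this toy graph, nodes `x` and `y` are unreachable from the seed and therefore keep a score of zero, which is the property that seed-based schemes rely on when the seeds are chosen as trusted elements.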
