A bias-tuned dishonesty-resistant reputation evaluation method for trust establishment in grid

In grids, cooperation often occurs between entities that are unknown to each other. To keep such cooperation running smoothly and reliably, a reliable trust relationship must be established between them. Widely applied in electronic commerce and online communities, reputation mechanisms have emerged as a promising solution, for which a sound reputation evaluation method is crucial. Yet most currently available methods do not consider the grid's distinct characteristics, such as the sparseness of ratings and the unfamiliarity of entities with one another, and are therefore not well suited to grids. In this paper, we propose a bias-tuned dishonesty-resistant reputation evaluation method for trust establishment in grids. What distinguishes our trust model is the introduction of a pre-evaluating set, with which we can filter out malicious ratings, tune a rater's bias to match the current evaluator's criteria, and weight a rater's rating in aggregation.
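
The abstract does not give the formulas, so the following is only a minimal illustrative sketch of the three steps it names (filtering, bias tuning, weighting), not the paper's actual method. It assumes, hypothetically, that the pre-evaluating set consists of entities rated by both the evaluator and the rater, that scores lie in [0, 1], that bias is the mean difference between the two parties' scores on that set, and that a rater is discarded when its scores remain inconsistent after bias removal. The names `evaluate_reputation` and `max_residual` are invented for this example.

```python
from statistics import mean


def evaluate_reputation(own, raters, target, max_residual=0.2):
    """Aggregate raters' scores for `target` from the evaluator's viewpoint.

    own:    evaluator's own ratings, {entity: score in [0, 1]}
    raters: {rater_name: {entity: score in [0, 1]}}
    Returns the weighted reputation, or None if no rater is usable.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for ratings in raters.values():
        if target not in ratings:
            continue
        # Assumed pre-evaluating set: entities both parties have rated.
        shared = [e for e in ratings if e in own and e != target]
        if not shared:
            continue
        diffs = [ratings[e] - own[e] for e in shared]
        bias = mean(diffs)  # systematic leniency or strictness
        residual = mean(abs(d - bias) for d in diffs)  # leftover disagreement
        if residual > max_residual:
            continue  # treat as dishonest: filter the rating out
        # Tune the rating to the evaluator's criteria and clamp to [0, 1].
        adjusted = min(1.0, max(0.0, ratings[target] - bias))
        weight = 1.0 - residual  # more consistent raters count more
        weighted_sum += weight * adjusted
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None
```

Under these assumptions, a consistently strict rater is corrected rather than discarded, while a rater whose scores disagree erratically with the evaluator's is dropped before aggregation.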
