Duplicate Detection for Identifying Social Spam in Microblogs
Aoying Zhou | Weining Qian | Haixin Ma | Qunyan Zhang