KHyperLogLog: Estimating Reidentifiability and Joinability of Large Data at Scale
Chao Li | Pern Hui Chia | Damien Desfontaines | Wei-Yen Day | Irippuge Milinda Perera | Qiushi Wang | Daniel Simmons-Marengo | Miguel Guevara
[1] Alon Y. Halevy, et al. Goods: Organizing Google's Datasets, 2016, SIGMOD Conference.
[2] Ninghui Li, et al. t-Closeness: Privacy Beyond k-Anonymity and l-Diversity, 2007, 2007 IEEE 23rd International Conference on Data Engineering.
[3] Saikat Guha, et al. Bootstrapping Privacy Compliance in Big Data Systems, 2014, 2014 IEEE Symposium on Security and Privacy.
[4] Vitaly Shmatikov, et al. Robust De-anonymization of Large Sparse Datasets, 2008, 2008 IEEE Symposium on Security and Privacy (sp 2008).
[5] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[6] Shigang Chen, et al. Better with fewer bits: Improving the performance of cardinality estimation of large data streams, 2017, IEEE INFOCOM 2017 - IEEE Conference on Computer Communications.
[7] Peter Eckersley, et al. How Unique Is Your Web Browser?, 2010, Privacy Enhancing Technologies.
[8] Cynthia Dwork, et al. Differential Privacy, 2006, ICALP.
[9] Ashwin Machanavajjhala, et al. L-diversity: privacy beyond k-anonymity, 2006, 22nd International Conference on Data Engineering (ICDE'06).
[10] Bruce G. Lindsay, et al. Approximate medians and other quantiles in one pass and with limited memory, 1998, SIGMOD '98.
[11] Zoubin Ghahramani, et al. Learning from labeled and unlabeled data with label propagation, 2002.
[12] Edo Liberty, et al. Optimal Quantile Approximation in Streams, 2016, 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS).
[13] Graham Cormode, et al. An improved data stream summary: the count-min sketch and its applications, 2004, J. Algorithms.
[14] P. Flajolet, et al. HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm, 2007.
[15] Andrei Z. Broder, et al. On the resemblance and containment of documents, 1997, Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171).
[16] Yufei Tao, et al. Anatomy: simple and effective privacy preservation, 2006, VLDB.
[17] J. Abowd, et al. Session 04a - The Decennial Census of Population and Housing, 2011.
[18] Moses Charikar, et al. Finding frequent items in data streams, 2002, Theor. Comput. Sci.
[19] Yun William Yu, et al. HyperMinHash: Jaccard index sketching in LogLog space, 2017, ArXiv.
[20] Graham Cormode, et al. An Improved Data Stream Summary: The Count-Min Sketch and Its Applications, 2004, LATIN.
[21] Alexander Hall, et al. HyperLogLog in practice: algorithmic engineering of a state of the art cardinality estimation algorithm, 2013, EDBT '13.
[22] Graham Cormode, et al. Space efficient mining of multigraph streams, 2005, PODS.
[23] Peter J. Haas, et al. On synopses for distinct-value estimation under multiset operations, 2007, SIGMOD '07.
[24] Latanya Sweeney, et al. k-Anonymity: A Model for Protecting Privacy, 2002, Int. J. Uncertain. Fuzziness Knowl. Based Syst.
[25] Divyakant Agrawal, et al. Efficient Computation of Frequent and Top-k Elements in Data Streams, 2005, ICDT.
[26] Moses Charikar, et al. Similarity estimation techniques from rounding algorithms, 2002, STOC '02.
[27] Philippe Flajolet, et al. Probabilistic counting, 1983, 24th Annual Symposium on Foundations of Computer Science (sfcs 1983).