Top-down itemset recoding for releasing private complex data

Complex data, which combines single-valued attributes with set-valued attributes, lets analysts associate the values of both kinds of attribute and study the relationships between them. Before such data is released, the anonymity of the data owners must be ensured. However, existing anonymization methods do not work well on complex data, because they assume that the quasi-identifiers are either multidimensional single-valued attributes or a single set-valued attribute. This paper proposes an anonymization method that integrates the recoding of single-valued attributes and of set-valued attributes into one top-down anonymization procedure. To integrate the recoding of set-valued attributes, the paper also proposes top-down itemset recoding, which proceeds in a top-down manner and ensures k-anonymity without obfuscating items. Experiments on a real dataset clarify the characteristics and effectiveness of the proposed method. The method does not require a generalization hierarchy for set-valued attributes, so the anonymized itemsets are not obfuscated and can be analyzed with standard data mining tools.
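
To make the k-anonymity requirement on complex data concrete, the following is a minimal sketch of a check that treats each record's quasi-identifier as the pair of its single-valued attribute tuple and its itemset. It is not the recoding method proposed in the paper; the `Record` type, the `is_k_anonymous` function, and the sample values are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import FrozenSet, List, Tuple

# Hypothetical record layout (an assumption, not taken from the paper):
# (tuple of single-valued quasi-identifiers, itemset of the set-valued attribute)
Record = Tuple[Tuple[str, ...], FrozenSet[str]]


def is_k_anonymous(records: List[Record], k: int) -> bool:
    """Return True if every distinct (single-valued tuple, itemset) pair
    is shared by at least k records."""
    counts = Counter(records)
    return all(count >= k for count in counts.values())


# Illustrative data: two groups of two indistinguishable records each.
records: List[Record] = [
    (("30-39", "Tokyo"), frozenset({"milk", "bread"})),
    (("30-39", "Tokyo"), frozenset({"milk", "bread"})),
    (("40-49", "Osaka"), frozenset({"beer"})),
    (("40-49", "Osaka"), frozenset({"beer"})),
]

print(is_k_anonymous(records, 2))
print(is_k_anonymous(records, 3))
```

In this sketch the itemsets are compared as plain sets of original items, which mirrors the abstract's point that items need not be replaced by generalized values for the released records to be checked or mined.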
