Efficient Data Perturbation for Privacy Preserving and Accurate Data Stream Mining

The widespread use of the Internet of Things (IoT) has raised many concerns, including the protection of private information. Existing privacy preservation methods do not balance data utility and privacy well, and they also suffer from efficiency and scalability problems. This paper proposes an efficient data stream perturbation method named $P^2RoCAl$. $P^2RoCAl$ offers better data utility than similar methods: the classification accuracies obtained on $P^2RoCAl$-perturbed data streams are very close to those obtained on the original data streams. $P^2RoCAl$ also provides higher resilience against data reconstruction attacks.
