Toward Anonymizing IoT Data Streams via Partitioning

Internet-of-Things (IoT) devices can capture physiological measures, location, and activity information, so sharing sensed data raises privacy concerns. Data anonymization offers a solution to this problem; however, traditional anonymization approaches only provide privacy protection for a data stream generated from a single entity. Since a single entity can use multiple IoT devices at any given instant, IoT data streams are not fixed in width. Because conventional data stream anonymization algorithms only work on fixed-width data streams, they cannot be applied to IoT. In this work, we propose an anonymization algorithm for publishing IoT data streams. Our approach anonymizes tuples with similar descriptions in a single cluster under a time-based sliding window. It considers the similarity of tuples when clustering and anonymizes tuples with missing values using representative values. Our experiments on a real dataset show that the proposed algorithm publishes data with less information loss and runs faster than conventional anonymization approaches modified to run on IoT data streams.
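The general idea of partitioning tuples by description within a time window can be illustrated with a minimal sketch. It is not the algorithm proposed in this work, whose details the abstract does not give. The sketch assumes that a tuple's "description" is the set of attributes it actually reports, that the representative value for a missing attribute is the cluster mean, that windows are non-overlapping (a simplification of a sliding window), and that groups smaller than k are merged or suppressed. All function names and the parameter k are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def time_windows(stream, window_seconds):
    """Split (timestamp, record) pairs into consecutive, non-overlapping windows."""
    window, window_start = [], None
    for ts, record in stream:
        if window_start is None:
            window_start = ts
        if ts - window_start >= window_seconds:
            yield window
            window, window_start = [], ts
        window.append(record)
    if window:
        yield window


def generalize(cluster, attributes):
    """Publish each attribute of a cluster as a (min, max) range.

    Missing values (None) are first replaced by a representative value,
    assumed here to be the mean of the values observed in the cluster.
    """
    published = {}
    for a in attributes:
        present = [r[a] for r in cluster if r.get(a) is not None]
        if not present:
            published[a] = None  # no device in this cluster reported it
            continue
        rep = mean(present)
        values = [r[a] if r.get(a) is not None else rep for r in cluster]
        published[a] = (min(values), max(values))
    return published


def anonymize_window(records, attributes, k):
    """Partition records by description, then publish clusters of size >= k."""
    partitions = defaultdict(list)
    for r in records:
        description = frozenset(a for a in attributes if r.get(a) is not None)
        partitions[description].append(r)

    clusters, leftovers = [], []
    for group in partitions.values():
        if len(group) >= k:
            clusters.append(group)
        else:
            leftovers.extend(group)  # too few tuples share this description
    if len(leftovers) >= k:
        clusters.append(leftovers)   # merged cluster, filled with representatives
    # Leftovers that still number fewer than k are suppressed in this sketch.
    return [(len(c), generalize(c, attributes)) for c in clusters]


stream = [
    (0, {"hr": 70, "steps": 100}),
    (1, {"hr": 80, "steps": 200}),
    (2, {"hr": 60, "steps": None}),
    (3, {"hr": None, "steps": 300}),
]
for window in time_windows(stream, window_seconds=10):
    for size, ranges in anonymize_window(window, ["hr", "steps"], k=2):
        print(size, ranges)
```

Here the two complete tuples form one cluster, while the two tuples with different missing attributes are merged into a second cluster whose gaps are filled before generalization.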
