Real-time dynamic data desensitization method based on data stream

With the rapid development of the data mining industry, the value hidden in massive data sets has been uncovered, but this has also raised concerns about privacy breaches and the leakage of sensitive data. These problems have become the subject of numerous studies. Among the methods proposed to address them, data desensitization has been widely adopted for its strong performance. However, as data grow in both scale and dimensionality, traditional desensitization methods designed for static data can no longer meet the sensitive-data protection requirements of today's industries. Faced with data sets whose scale and dimensionality change constantly, static desensitization relies on manually specified desensitization rules to handle massive data, which makes it difficult to control the loss of the information the data carry. In response to these problems, this paper proposes a real-time dynamic desensitization method based on data streams and combines it with a data anonymization mechanism to optimize the desensitization strategy. Experiments show that the method performs real-time desensitization of stream data efficiently and stably, and that it retains more information to support subsequent data mining.
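To make the idea of combining stream desensitization with anonymization concrete, the following is a minimal sketch and not the algorithm of this paper, whose details the abstract does not give. It assumes hypothetical record fields `name` and `age`, a hypothetical group size `k`, and a simple buffering rule: records are held until `k` have arrived, then released together with the name masked and the age generalized to the group's range, so that each released age value is shared by at least `k` records.

```python
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, object]  # hypothetical record shape: {"name": str, "age": int}


def mask_name(name: str) -> str:
    """Keep the first character and replace the rest with asterisks."""
    return name[:1] + "*" * (len(name) - 1) if name else name


def release_group(group: List[Record]) -> List[Record]:
    """Mask names and generalize ages to the [min, max] range of the group."""
    ages = [r["age"] for r in group]
    age_range = f"{min(ages)}-{max(ages)}"
    return [{**r, "name": mask_name(r["name"]), "age": age_range} for r in group]


def desensitize_stream(records: Iterable[Record], k: int = 3) -> Iterator[Record]:
    """Buffer incoming records and release them in groups of k.

    Assumption for illustration only: a fixed-size buffer stands in for
    whatever windowing strategy a real system would use.
    """
    buffer: List[Record] = []
    for record in records:
        buffer.append(record)
        if len(buffer) >= k:
            yield from release_group(buffer)
            buffer = []
    # Fewer than k records remain: suppress the age rather than expose it.
    for record in buffer:
        yield {**record, "name": mask_name(record["name"]), "age": "*"}


if __name__ == "__main__":
    stream = [
        {"name": "Alice", "age": 31},
        {"name": "Bob", "age": 35},
        {"name": "Carol", "age": 38},
        {"name": "Dave", "age": 52},
    ]
    for row in desensitize_stream(stream, k=3):
        print(row)
```

The sketch trades latency for privacy: a larger `k` gives each record more company in its group but delays release and widens the generalized range, which is the kind of information loss the abstract says the proposed method aims to limit.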
