The explosion of new data mining techniques has increased privacy risks, because it is now possible to efficiently combine and query massive data stores, available on the web, in search of previously unknown hidden patterns. To make a publicly accessible system secure, we must guarantee not only that private sensitive data have been trimmed out, but also that inference channels have been blocked: both the data and the knowledge hidden in the data must be protected. Furthermore, the requirement to keep the system as open as possible, to the extent that data sensitivity is not jeopardized, calls for techniques that account for the disclosure control of sensitive data. At its core, the value of privacy-preserving data mining derives not only from its ability to extract important knowledge, but also from its resilience to attack; it performs at the required level in both crisis and normal operation. The central thrust of this work is toward establishing strong data security, in which users of knowledge continue to benefit from data without compromising data privacy.

The goal of privacy-preserving data mining is to release a dataset that researchers can study without being able to identify, with high probability, sensitive information about any individual in the data. One technique is to replace sensitive items with unknown values. In many situations it is safer for the sanitization process to assign unknown values rather than false ones: this obscures the sensitive rules while protecting the user of the data from false rules. In this study, we modify the blocking algorithms of [1] by proposing a new heuristic that reduces information loss, and we put forward an enhanced approach that overcomes the privacy-breach problem of existing blocking approaches. Although the authors argued that the rules are truly safe from an attack by an adversary, they did not formally prove this safety; we provide such a proof. We investigate how probabilistic and information-theoretic techniques can be applied to this problem, and we give a more complete analysis of the effectiveness of these rule-obscuring techniques together with a formal study of the problem. Our preliminary results indicate that deterministic algorithms for privacy-preserving association rule hiding provide a promising framework for controlling the disclosure of sensitive data and knowledge.
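To make the blocking idea concrete, the sketch below replaces item values with unknowns ('?') until the minimum possible confidence of a sensitive rule falls below the mining threshold. This is a minimal illustration, not the exact heuristic studied here or in [1]: the transaction layout, the pessimistic confidence bound, the greedy victim selection, and all names (UNKNOWN, hide_rule, and so on) are illustrative assumptions.

```python
UNKNOWN = "?"

def min_support(db, itemset):
    # Fraction of transactions that *certainly* contain the itemset:
    # every item must be an explicit 1 (an unknown might hide a 0).
    n = sum(1 for t in db if all(t.get(i) == 1 for i in itemset))
    return n / len(db)

def max_support(db, itemset):
    # Fraction that *could* contain it: 1s and unknowns both count.
    n = sum(1 for t in db if all(t.get(i) in (1, UNKNOWN) for i in itemset))
    return n / len(db)

def min_confidence(db, lhs, rhs):
    # Pessimistic confidence of lhs -> rhs once unknowns are present:
    # certain joint support over the largest possible antecedent support.
    denom = max_support(db, lhs)
    return 0.0 if denom == 0 else min_support(db, lhs | rhs) / denom

def hide_rule(db, lhs, rhs, conf_threshold):
    """Greedily block one supporting transaction at a time until the
    rule's minimum confidence falls below the mining threshold."""
    while min_confidence(db, lhs, rhs) >= conf_threshold:
        # Pick a transaction that fully supports lhs ∪ rhs and blank one
        # consequent item there; a '?' (rather than a false 0) hides the
        # rule without injecting a fabricated rule into the data.
        victim = next((t for t in db
                       if all(t.get(i) == 1 for i in lhs | rhs)), None)
        if victim is None:
            break  # nothing left to block
        victim[next(iter(rhs))] = UNKNOWN

# Toy usage: hide {bread} -> {milk}, mined at 80% confidence.
db = [{"bread": 1, "milk": 1} for _ in range(4)] + [{"bread": 0, "milk": 1}]
hide_rule(db, {"bread"}, {"milk"}, 0.80)
print(min_confidence(db, {"bread"}, {"milk"}))  # 0.75 < 0.80: rule hidden
```

Because a '?' may stand for either a 0 or a 1, a miner can no longer certify that the sensitive rule holds, yet no false rule is introduced, which is precisely the advantage of unknowns over false values noted above.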
[1] William Yurcik, et al. Sharing computer network logs for security and privacy: a motivation for new methodologies of anonymization. Workshop of the 1st International Conference on Security and Privacy for Emerging Areas in Communication Networks, 2005.
[2] Osmar R. Zaïane, et al. Protecting sensitive knowledge by data sanitization. Third IEEE International Conference on Data Mining, 2003.
[3] Karl N. Levitt, et al. How to sanitize data? 13th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises, 2004.
[4] Anton Stiglic, et al. Traffic Analysis Attacks and Trade-Offs in Anonymity Providing Systems. Information Hiding, 2001.
[5] Rakesh Agrawal, et al. Privacy-preserving data mining. SIGMOD, 2000.
[6] Mary K. Vernon, et al. Mapping Internet Sensors with Probe Response Attacks. USENIX Security Symposium, 2005.
[7] T. Kohno, et al. Remote physical device fingerprinting. IEEE Symposium on Security and Privacy (S&P'05), 2005.
[8] Elisa Bertino, et al. Association rule hiding. IEEE Transactions on Knowledge and Data Engineering, 2004.
[9] Vitaly Shmatikov, et al. Privacy-Preserving Sharing and Correlation of Security Alerts. USENIX Security Symposium, 2004.
[10] M.E. Locasto, et al. Towards collaborative security and P2P intrusion detection. Proceedings of the Sixth Annual IEEE SMC Information Assurance Workshop, 2005.
[11] Vern Paxson, et al. A high-level programming environment for packet trace anonymization and transformation. SIGCOMM, 2003.
[12] Vitaly Shmatikov, et al. Large-scale collection and sanitization of network security data: risks and challenges. NSPW, 2006.
[13] Chris Clifton, et al. Using unknowns to prevent discovery of association rules. SIGMOD Record, 2001.
[14] Peng Ning, et al. Privacy-preserving alert correlation: a concept hierarchy based approach. 21st Annual Computer Security Applications Conference (ACSAC'05), 2005.
[15] Albert G. Greenberg, et al. Structure preserving anonymization of router configuration data. IEEE Journal on Selected Areas in Communications, 2004.