Homeland security and privacy sensitive data mining from multi-party distributed resources

Defending an open society against terrorism and similar threats requires intelligent but careful ways of monitoring different types of activities and transactions in electronic media. Data mining techniques play an increasingly important role in sifting through large amounts of data in search of useful patterns that might help secure our safety. Although the objective of this class of data mining applications is well justified, these applications also open up the possibility that malicious people with access to the sensitive data will misuse personal information. This raises the following question: can we design data mining techniques that are sensitive to privacy? Several researchers are currently working on a class of data mining algorithms that operate without directly accessing the sensitive data in its original form. This paper considers the problem of mining distributed data in a privacy-sensitive manner. It first points out problems with some existing privacy-sensitive data mining techniques that use additive random noise to hide sensitive information. It then briefly reviews some new approaches that use random projection matrices to compute statistical aggregates from sensitive data.
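The random projection idea mentioned above can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not spell out. It assumes two parties that each hold one private vector and share a secret seed, so both can build the same random matrix R with k rows and n columns, where k is less than n. Each party releases only its projected vector, and the inner product of the projections approximates the inner product of the originals. All names (`random_projection_matrix`, `project`, `dot`), the dimensions, and the seeds are hypothetical and chosen only for illustration.

```python
import math
import random


def random_projection_matrix(k, n, seed):
    """Build a k x n matrix of i.i.d. Gaussian entries with variance 1/k.

    With this scaling, E[(R a) . (R b)] equals a . b for any vectors a and b.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(k)]


def project(matrix, vector):
    """Return the matrix-vector product as a plain list."""
    return [sum(r * x for r, x in zip(row, vector)) for row in matrix]


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


# Hypothetical private data held by two different parties.
n, k = 2000, 500  # k < n, so R has no inverse
data_rng = random.Random(1)
x = [data_rng.uniform(0.0, 1.0) for _ in range(n)]  # party A's vector
y = [data_rng.uniform(0.0, 1.0) for _ in range(n)]  # party B's vector

# Assumption: both parties derive the same R from a shared secret seed.
R = random_projection_matrix(k, n, seed=42)

# Only the projected vectors are handed to the data miner.
u = project(R, x)
v = project(R, y)

true_ip = dot(x, y)   # not computable by the miner, shown for comparison
est_ip = dot(u, v)    # the miner's estimate from projected data only
print("true inner product:     ", round(true_ip, 2))
print("estimated inner product:", round(est_ip, 2))
```

The estimate is unbiased, and its variance shrinks roughly in proportion to 1/k, so a larger k gives better accuracy while moving the projected data closer to the originals. The sketch makes no formal privacy claim; it only shows how an aggregate such as an inner product can be estimated without releasing the raw vectors.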
