Graph-Based Abstraction for Privacy Preserving Manifold Visualization

With the next-generation Web aiming to further facilitate data and information sharing and aggregation, providing data privacy protection in open networked environments becomes increasingly important. Learning-from-abstraction is a recently proposed distributed data mining approach that first abstracts data at local sources using the agglomerative hierarchical clustering (AGH) algorithm and then aggregates the abstractions (instead of the data) for global analysis. In this paper, we explain the limitations of AGH for local manifold-preserving data abstraction and propose using a graph-based clustering approach (e.g., the minimum cut) for local data abstraction instead. The effectiveness of the proposed abstraction approach was evaluated on benchmark datasets, with promising results. The global analysis results based on the minimum cut abstraction were found to outperform those based on the AGH abstraction, especially when the underlying manifold was complex.
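The local abstraction step described above can be illustrated with a minimal sketch. It is not the method evaluated in the paper: it assumes a dense Gaussian affinity graph, substitutes a spectral bisection (the sign of the Fiedler vector of the graph Laplacian) for an exact minimum cut, and summarises each cluster by its size, mean, and covariance. The function names, the bandwidth heuristic, and the form of the summary are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_affinity(X):
    """Dense affinity graph with Gaussian edge weights (assumed graph construction)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    positive = d2[d2 > 0]
    # Bandwidth heuristic (an assumption): median squared pairwise distance.
    sigma2 = np.median(positive) if positive.size else 1.0
    W = np.exp(-d2 / sigma2)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def spectral_bisect(X):
    """Split points in two by the sign of the Fiedler vector.

    This is a spectral relaxation of a graph cut, used here as a
    stand-in for an exact minimum cut.
    """
    W = gaussian_affinity(X)
    L = np.diag(W.sum(axis=1)) - W   # unnormalised graph Laplacian
    _, vecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    return vecs[:, 1] >= 0


def abstract_local_data(X, n_clusters=2):
    """Recursively bisect the largest cluster, then summarise each cluster.

    Only the summaries (not the raw points) would leave the local source.
    """
    clusters = [np.arange(len(X))]
    while len(clusters) < n_clusters:
        clusters.sort(key=len)
        idx = clusters.pop()         # largest remaining cluster
        if len(idx) < 2:             # nothing left to split
            clusters.append(idx)
            break
        mask = spectral_bisect(X[idx])
        clusters += [idx[mask], idx[~mask]]
    return [
        {
            "n": len(idx),
            "mean": X[idx].mean(axis=0),
            "cov": np.atleast_2d(np.cov(X[idx], rowvar=False, bias=True)),
        }
        for idx in clusters
    ]
```

A global analysis step would then operate only on the returned summaries from each source, so the raw local data never leaves its origin. How the summaries are aggregated is outside the scope of this sketch.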
