An enhanced secure preserving for pre-processed data using DMI and PCRBAC algorithm

Data pre-processing plays an important role in data mining by ensuring better data quality. Data extracted from raw sources contains impurities and noise, which lead to inefficient analysis, inaccurate decisions and user inconvenience. Data pre-processing tasks involve identifying outliers and cleaning noisy data. In this paper we present a "Decision tree based Missing value Imputation technique" (DMI), which makes use of an EM algorithm and a decision tree (DT) algorithm. The resulting pre-processed data should then be secured using privacy-preserving techniques. Privacy can be obtained by applying cryptographic techniques that grant access to the stored data according to each individual's role, which protects the data from unauthorised access.
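The abstract does not spell out the steps of DMI, so the following is only a minimal sketch of the general idea of tree-guided imputation: records are partitioned by a tree learned from the complete records, and each missing value is filled from its own partition. Two simplifications are assumptions made here, not the paper's method: the tree is a single split (depth 1) on one numeric attribute, and the per-leaf mean stands in for the EM step. The function names `best_split` and `impute` and the sample attributes `age` and `income` are hypothetical.

```python
from statistics import mean


def best_split(rows, split_attr, target_attr):
    """Return (threshold, left_mean, right_mean) for the depth-1 split on
    split_attr that minimises the within-leaf squared error of target_attr.
    Assumes split_attr takes at least two distinct values in rows."""
    values = sorted({r[split_attr] for r in rows})
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # candidate threshold between neighbouring values
        left = [r[target_attr] for r in rows if r[split_attr] <= t]
        right = [r[target_attr] for r in rows if r[split_attr] > t]
        m_left, m_right = mean(left), mean(right)
        sse = (sum((v - m_left) ** 2 for v in left)
               + sum((v - m_right) ** 2 for v in right))
        if best is None or sse < best[0]:
            best = (sse, t, m_left, m_right)
    return best[1:]


def impute(records, split_attr, target_attr):
    """Fill missing target_attr values (None) with the mean of the leaf
    that each incomplete record falls into. The per-leaf mean is a
    stand-in for EM-based imputation (assumption of this sketch)."""
    complete = [r for r in records if r[target_attr] is not None]
    threshold, left_mean, right_mean = best_split(complete, split_attr, target_attr)
    filled = []
    for r in records:
        r = dict(r)  # copy so the input records are not modified
        if r[target_attr] is None:
            r[target_attr] = left_mean if r[split_attr] <= threshold else right_mean
        filled.append(r)
    return filled


# Hypothetical sample data: two records have a missing income.
records = [
    {"age": 25, "income": 30.0},
    {"age": 30, "income": 32.0},
    {"age": 50, "income": 80.0},
    {"age": 55, "income": 84.0},
    {"age": 28, "income": None},
    {"age": 52, "income": None},
]
filled = impute(records, "age", "income")
print(filled[4]["income"], filled[5]["income"])  # prints 31.0 82.0
```

Imputing within a leaf rather than over the whole data set means each missing value is estimated only from records that resemble it, which is the motivation for combining a decision tree with an imputation step.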
