Multi-Dimensional Inference and Confidential Data Protection with Decision Tree Methods

Abstract: We present a novel approach to protecting confidential data in databases. We adopt the decision tree framework as our baseline and extend it to handle databases in which the class label attribute is not specified. We focus on confidential data that are randomly distributed over different attributes, a setting we refer to as multi-dimensional inference. To protect confidential data, our method, referred to as adaptive modification, mitigates inference by evaluating and modifying only some of the relevant data records rather than all of them. We localize data modification within a decision tree and, instead of exhaustively evaluating all modification possibilities, select informative data to modify. The proposed method is effective in protecting confidential data and scales to large databases.