Application of density-based outlier detection to database activity monitoring

To prevent internal data leakage, database activity monitoring uses software agents to analyze protocol traffic over networks and to observe local database activities. However, the large size of data obtained from database activity monitoring has presented a significant barrier to effective monitoring and analysis of database activities. In this paper, we present database activity monitoring by means of a density-based outlier detection method and a commercial database activity monitoring solution. In order to provide efficient computing of outlier detection, we exploited a kd-tree index and an Approximated k-nearest neighbors (ANN) search method. By these means, the outlier computation time could be significantly reduced. The proposed methodology was successfully applied to a very large log dataset collected from the Korea Atomic Energy Research Institute (KAERI). The results showed that the proposed method can effectively detect outliers of database activities in a shorter computation time.