Using an SQP Algorithm to Choose the Feature Set for Data Field Clustering

Data field clustering, which is inspired by the concept of a field in physical space, offers a new perspective on clustering. It groups data objects by simulating their mutual attraction and opposing movements, and it has many advantages over traditional methods. However, the implementation of data field clustering remains incomplete. One important problem is that the performance of the clustering algorithms depends heavily on the number of initial data objects. Inspired by the key-point approximate representation of a time series, we define the key data object and the feature set for data fields in this paper, and we propose an algorithm that seeks the key data objects using sequential quadratic programming (SQP). Experimental results show that the SQP algorithm outperforms current competitors in choosing the feature set, namely the Lagrangian multiplier algorithm, the quadratic programming with constraints algorithm, and the unconstrained programming algorithm. In addition, the results verify that the data field generated by the feature subset closely approximates that of the original data set while using fewer data objects. This makes it possible to significantly improve the performance of data field clustering algorithms on big data sets.
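The idea that a feature subset can reproduce the data field of the full data set can be illustrated with a minimal sketch. It is not the algorithm proposed in this paper: the Gaussian-like potential function, the impact factor `sigma`, the toy data, the hand-picked key objects and their masses, and the helper names `potential` and `max_field_gap` are all assumptions made for illustration. No SQP step is performed here; the sketch only measures how far the field of a candidate subset deviates from the field of the full set.

```python
import math

def potential(x, objects, masses, sigma):
    """Potential at point x, assuming a Gaussian-like data field:
    sum_i m_i * exp(-(||x - x_i|| / sigma)^2)."""
    total = 0.0
    for p, m in zip(objects, masses):
        d2 = sum((a - b) ** 2 for a, b in zip(x, p))  # squared distance
        total += m * math.exp(-d2 / sigma ** 2)
    return total

def max_field_gap(full, full_m, subset, subset_m, sigma, probes):
    """Largest absolute difference between the two fields over the probes."""
    return max(
        abs(potential(q, full, full_m, sigma)
            - potential(q, subset, subset_m, sigma))
        for q in probes
    )

# Toy data set (assumed): two tight groups of three objects each.
full = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
full_m = [1.0 / len(full)] * len(full)

# Probe points on an integer grid covering both groups.
probes = [(float(i), float(j)) for i in range(-1, 7) for j in range(-1, 7)]
sigma = 1.0

# Hand-picked candidate: one object per group, each carrying half the mass.
good_gap = max_field_gap(full, full_m, [(0.0, 0.0), (5.0, 5.0)],
                         [0.5, 0.5], sigma, probes)

# Poor candidate: a single object that ignores the second group.
bad_gap = max_field_gap(full, full_m, [(0.0, 0.0)], [1.0], sigma, probes)

print(f"gap with one key object per group: {good_gap:.4f}")
print(f"gap with a single key object:      {bad_gap:.4f}")
```

In this toy setting the two-object subset tracks the full field far more closely than the single-object subset. A constrained optimizer such as SQP would search for the masses (and hence the key objects) that minimize a gap of this kind instead of relying on a hand-picked candidate.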