Feature space learning model

With the massive volume and rapid growth of data, the study of feature spaces is of great importance. To avoid the complex training required by deep learning models, which project the original feature space into low-dimensional ones, we propose a novel feature space learning (FSL) model. The main contributions of our approach are: (1) FSL not only selects useful features but also adaptively updates feature values and spans new feature spaces; (2) four FSL algorithms are proposed, each incorporating the feature-space updating procedure; (3) FSL provides a better understanding of the data and learns descriptive, compact feature spaces without the demanding training of deep architectures. Experimental results on benchmark data sets demonstrate that the FSL-based algorithms outperform classical unsupervised, semi-supervised, and even incremental semi-supervised algorithms. In addition, we visualize the learned feature spaces. With a carefully designed learning strategy, FSL dynamically disentangles explanatory factors, suppresses noise accumulation and semantic shift, and constructs easy-to-understand feature spaces.
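The abstract describes two coupled steps: selecting useful features and then adaptively updating feature values to span a new, more compact space. The paper's actual algorithms are not given here, so the following is only a minimal illustrative sketch under assumed choices: variance-based feature selection, naive k-means-style clustering, and an update that shrinks each point toward its cluster centroid. The names `select_features`, `fsl_sketch`, and the parameter `alpha` are hypothetical, not from the paper.

```python
from statistics import pvariance

def select_features(X, k):
    # Rank features by variance and keep the top-k column indices
    # (a simple stand-in for a feature-selection step; hypothetical).
    n_feats = len(X[0])
    ranked = sorted(range(n_feats),
                    key=lambda j: pvariance(row[j] for row in X),
                    reverse=True)
    return sorted(ranked[:k])

def sq_dist(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fsl_sketch(X, k_feats, n_clusters, n_iter=5, alpha=0.3):
    # Project onto selected features, then alternate clustering with a
    # feature-value update that moves each point toward its centroid,
    # yielding a progressively more compact feature space.
    feats = select_features(X, k_feats)
    Z = [[row[j] for j in feats] for row in X]
    centroids = [list(z) for z in Z[:n_clusters]]      # naive init
    assign = [0] * len(Z)
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        assign = [min(range(n_clusters),
                      key=lambda c: sq_dist(z, centroids[c]))
                  for z in Z]
        # Recompute centroids (keep the old one if a cluster empties).
        for c in range(n_clusters):
            members = [Z[i] for i, a in enumerate(assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
        # Adaptive update: shrink each point part-way toward its centroid.
        Z = [[(1 - alpha) * x + alpha * m
              for x, m in zip(z, centroids[assign[i]])]
             for i, z in enumerate(Z)]
    return feats, Z, assign
```

On a toy set where one column is constant noise, `select_features` discards it, and the update loop pulls the remaining coordinates into tight, well-separated clusters; this mimics, in miniature, the "descriptive and compact" spaces the abstract claims.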
