Max-Entropy Feed-Forward Clustering Neural Network

The outputs of a non-linear feed-forward neural network are positive and, once normalized to sum to one, can be treated as probabilities. Under the maximum-entropy principle, the outputs for each sample can then be interpreted as that sample's distribution over clusters. The maximum-entropy principle estimates an unknown distribution subject to a set of constraints; this paper defines two processes in the feed-forward neural network, so in our setting the constraints are the abstract features of the samples computed in the abstraction process, and the final outputs are the probability distributions over clusters produced in the clustering process. Incorporating the maximum-entropy principle into the feed-forward neural network in this way yields a clustering method. We conducted experiments on six open UCI datasets, comparing against several of the most popular clustering methods as baselines and using purity as the evaluation measure. The results show that our method outperforms all of the baselines.
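
The paper does not include code, but the two processes it describes can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the authors' exact architecture: a one-hidden-layer network with a tanh abstraction stage and a softmax-normalized clustering stage, whose normalized outputs are read as each sample's distribution over clusters, with Shannon entropy computed over those outputs.

```python
import numpy as np

# Hypothetical sketch of the two processes described in the abstract.
# Layer sizes, the tanh activation, and softmax normalization are
# illustrative assumptions, not the paper's exact design.

rng = np.random.default_rng(0)

def init_params(n_features, n_hidden, n_clusters):
    return {
        "W1": rng.normal(0, 0.1, (n_features, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0, 0.1, (n_hidden, n_clusters)),
        "b2": np.zeros(n_clusters),
    }

def forward(params, X):
    # Abstraction process: non-linear extraction of abstract features.
    H = np.tanh(X @ params["W1"] + params["b1"])
    # Clustering process: positive outputs normalized to sum to one,
    # so each row of P is a probability distribution over clusters.
    logits = H @ params["W2"] + params["b2"]
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    P = e / e.sum(axis=1, keepdims=True)
    return H, P

def entropy(P, eps=1e-12):
    # Shannon entropy of each sample's cluster distribution; the
    # maximum-entropy principle would select P subject to the
    # feature constraints while maximizing this quantity.
    return -(P * np.log(P + eps)).sum(axis=1)

# Toy usage: 5 samples with 8 features, assigned among 3 clusters.
X = rng.normal(size=(5, 8))
params = init_params(n_features=8, n_hidden=16, n_clusters=3)
H, P = forward(params, X)
labels = P.argmax(axis=1)  # hard cluster assignment per sample
print(P.round(3), entropy(P).round(3), labels)
```

In this reading, each row of `P` plays the role of the per-sample cluster distribution the abstract describes, and `entropy(P)` is the quantity the entropy-based objective would act on during training.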
