Improved Density Peak Clustering Based on Information Entropy for Ancient Character Images

Many IoT applications rely on supervised machine learning, which requires labeled data before a model can be trained. Because manually labeling large datasets is time-consuming, error-prone, and expensive, automated machine learning methods can be used instead. To reduce the manual labeling of ancient character images, this paper explores the classification of ancient Chinese character images based on density peak clustering. We design a metric function for density peak clustering and propose an improved density peak clustering method based on information entropy for classifying ancient book images. The method enumerates candidate clustering distance thresholds, calculates the information entropy of the clustering result for each one, and determines the class distance threshold by analyzing the attenuation of the information entropy, thereby completing the image clustering process. The improved metric function is used to calculate the similarity between images, and a greedy strategy guides the merging of class members so as to increase the degree of information entropy attenuation. Experimental results on a dataset of Yi character images show that the method can accurately classify unknown character images from ancient books.
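The threshold-selection loop described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes plain Euclidean distance in place of the improved metric function, a simplified density-peak assignment rule, Shannon entropy over cluster sizes, and "attenuation" read as the largest entropy drop between consecutive thresholds. The greedy merging step is not modeled. All function names (`density_peak_labels`, `cluster_entropy`, `select_threshold`) are hypothetical.

```python
import numpy as np

def pairwise_distances(X):
    # Euclidean distances; an assumed stand-in for the paper's improved metric.
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def density_peak_labels(D, d_c):
    """Simplified density-peak clustering for one distance threshold d_c.

    Local density is the number of other points closer than d_c. Points are
    visited from highest to lowest density (ties broken by index); each one
    joins the cluster of its nearest already-visited point if that point lies
    within d_c, otherwise it starts a new cluster.
    """
    n = D.shape[0]
    rho = (D < d_c).sum(axis=1) - 1          # subtract 1 to exclude the point itself
    order = np.argsort(-rho, kind="stable")
    labels = np.full(n, -1)
    next_label = 0
    for rank, i in enumerate(order):
        higher = order[:rank]                # points treated as denser than i
        if higher.size:
            j = higher[np.argmin(D[i, higher])]
            if D[i, j] <= d_c:
                labels[i] = labels[j]
                continue
        labels[i] = next_label               # i becomes a new cluster center
        next_label += 1
    return labels

def cluster_entropy(labels):
    # Shannon entropy (bits) of the cluster-size distribution.
    counts = np.bincount(labels)
    p = counts[counts > 0] / labels.size
    return float(-(p * np.log2(p)).sum())

def select_threshold(D, thresholds):
    """Return the threshold after the largest entropy drop (assumed criterion).

    `thresholds` must be sorted ascending and contain at least two values.
    """
    entropies = [cluster_entropy(density_peak_labels(D, t)) for t in thresholds]
    drops = -np.diff(entropies)
    k = int(np.argmax(drops)) + 1
    return thresholds[k], entropies

# Toy data: two tight groups of four points each, far apart.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1]])
D = pairwise_distances(X)
best, entropies = select_threshold(D, [0.05, 0.5, 10.0])
labels = density_peak_labels(D, best)
print(best, entropies, labels)
```

On this toy data the entropy falls as the threshold grows, because a larger threshold lets more points join an existing cluster. In practice the feature vectors in `X` would come from the character images, and the distance function would be replaced by whatever similarity metric is actually in use.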
