A hybrid automatic image annotation approach

Automatic image annotation (AIA) is an important problem in computer vision and pattern recognition and plays a key role in large-scale image retrieval. Many image annotation approaches treat all regions of an image equally, which is inconsistent with how humans understand images. To improve on the annotation performance of existing AIA approaches, a hybrid AIA approach based on a visual attention mechanism (VAM) and a conditional random field (CRF) is proposed. First, because people pay more attention to the salient region of an image during recognition, the VAM is used to separate the image into salient and non-salient regions. Second, a support vector machine (SVM) is used to annotate the salient region, and a k-nearest-neighbor (kNN) voting algorithm is used to annotate the non-salient regions. Finally, because relationships exist between pairs of annotation words (also called labels), a CRF is used to obtain the final label set for each given image. The experimental results indicate that the proposed hybrid AIA approach achieves good annotation performance.
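The last two stages of the pipeline can be illustrated with a small sketch. It is not the authors' implementation: the feature vectors, the SVM scores for the salient region, the fusion weight, the 0.5 threshold, and the pairwise label compatibilities are all hypothetical values chosen for illustration, and the function names (`knn_vote`, `fuse_scores`, `crf_map`) are assumptions. The CRF step is reduced to exact MAP inference by enumeration over a tiny label vocabulary, which is only feasible for a handful of labels.

```python
from collections import Counter
from itertools import combinations, product
import math


def knn_vote(query, train_feats, train_labels, k=3):
    """Score each label by the fraction of the k nearest training regions that carry it."""
    order = sorted(range(len(train_feats)),
                   key=lambda i: math.dist(query, train_feats[i]))
    votes = Counter()
    for i in order[:k]:
        votes.update(train_labels[i])
    return {label: count / k for label, count in votes.items()}


def fuse_scores(salient_scores, background_scores, weight=0.6):
    """Weighted mix of per-label scores; `weight` (hypothetical) favors the salient region."""
    labels = set(salient_scores) | set(background_scores)
    return {l: weight * salient_scores.get(l, 0.0)
               + (1 - weight) * background_scores.get(l, 0.0)
            for l in labels}


def crf_map(scores, pairwise, threshold=0.5):
    """Exact MAP label set for a fully connected binary pairwise model, by enumeration."""
    vocab = sorted(scores)
    best_set, best_score = set(), float("-inf")
    for assignment in product([0, 1], repeat=len(vocab)):
        chosen = [l for l, on in zip(vocab, assignment) if on]
        # Unary term: how far each chosen label's score sits above the threshold.
        total = sum(scores[l] - threshold for l in chosen)
        # Pairwise term: compatibility of every pair of chosen labels.
        total += sum(pairwise.get(frozenset(p), 0.0) for p in combinations(chosen, 2))
        if total > best_score:
            best_set, best_score = set(chosen), total
    return best_set


# Hypothetical non-salient training regions (2-D features) and their labels.
train_feats = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]
train_labels = [["sky", "sea"], ["sky"], ["grass"], ["grass", "tree"]]

background = knn_vote((0.05, 0.0), train_feats, train_labels, k=2)
salient = {"boat": 0.9, "sea": 0.4}  # assumed to come from an SVM on the salient region
fused = fuse_scores(salient, background)

# Hypothetical label compatibilities, e.g. estimated from co-occurrence counts.
pairwise = {frozenset(("boat", "sea")): 0.2, frozenset(("sea", "sky")): 0.15}

threshold_only = {l for l, s in fused.items() if s > 0.5}
final_labels = crf_map(fused, pairwise)
print(sorted(threshold_only), sorted(final_labels))
```

In this toy example, thresholding the fused scores alone keeps only `boat`, while the pairwise terms also pull in `sea` and `sky`, which is the kind of effect that modeling label relationships is meant to provide.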
