Flickr Image Community Analytics by Deep Noise-Refined Matrix Factorization

Accurately categorizing Flickr images into multiple pre-defined communities (e.g., “architecture” and “peaceful”) is an important technique in multimedia analysis, graphic design, fashion recommendation, and related applications. In practice, these communities are constructed and updated manually, which is subjective and extremely time-consuming. To alleviate these shortcomings, a noise-refined deep matrix factorization (MF) framework is proposed to automatically discover communities from million-scale Flickr users, wherein semantic tag correlations and community correlations are simultaneously encoded. More specifically, Flickr communities are assumed to be high-level cues grounded in human visual semantic perception. Accordingly, an MF algorithm is employed to approximate the community label matrix by the product of a pair of factor matrices, which represent the latent representations of user-provided tags and the corresponding basis matrix, respectively. Subsequently, an end-to-end deep model is formulated to hierarchically derive the latent deep representation, from raw image pixels to semantic tags. To robustly handle contaminated image semantic tags and community labels, an $l_1$-norm constraint is incorporated into the MF. Meanwhile, to exploit the rich context information of Flickr images, the intrinsic structures among image semantic tags and among communities are jointly captured. Finally, the enhanced MF and the deep model are combined into a unified framework, which is solved by an iterative algorithm. Experiments on 2 million Flickr images demonstrate the superiority of the approach. In addition, the discovered Flickr communities significantly improve photo retargeting and visual aesthetics assessment.
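
To make the role of the $l_1$ norm concrete, the following is a minimal sketch of a robust matrix factorization that approximates a label matrix Y by a product U V under an entrywise $l_1$ loss. It is not the authors' algorithm: the plain subgradient updates, the function name `l1_matrix_factorization`, and the parameters `rank`, `n_iters`, and `lr` are all assumptions chosen for illustration, and the deep model and the tag and community correlation terms described above are omitted.

```python
import numpy as np


def l1_matrix_factorization(Y, rank, n_iters=300, lr=0.01, seed=0):
    """Approximate Y (n x m) by U @ V under the entrywise l1 loss.

    Illustrative assumption: fixed-step subgradient descent, not the
    iterative solver used in the paper.
    """
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((rank, m))

    # Fixed-step subgradient descent is not monotone, so keep the best iterate.
    best_loss = np.abs(Y - U @ V).sum()
    best_U, best_V = U.copy(), V.copy()

    for _ in range(n_iters):
        # Subgradient of ||U V - Y||_1 with respect to the product U V.
        R = np.sign(U @ V - Y)
        # Simultaneous update: both gradients use the current U and V.
        U, V = U - lr * (R @ V.T), V - lr * (U.T @ R)

        loss = np.abs(Y - U @ V).sum()
        if loss < best_loss:
            best_loss = loss
            best_U, best_V = U.copy(), V.copy()

    return best_U, best_V, best_loss


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic rank-2 matrix with a few large corruptions (hypothetical data).
    Y = rng.standard_normal((12, 2)) @ rng.standard_normal((2, 9))
    Y[0, 0] += 10.0
    Y[5, 3] -= 10.0
    U, V, loss = l1_matrix_factorization(Y, rank=2)
    print("final l1 reconstruction error:", loss)
```

Because the $l_1$ loss grows linearly rather than quadratically with the residual, a few heavily corrupted entries contribute less to the objective than they would under a squared (Frobenius) loss, which is the usual motivation for using it with noisy tags and labels.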
