Incorporating Deep Visual Features into Multiobjective based Multi-view Search Results Clustering

This paper explores the use of multi-view learning for search result clustering. A web snippet can be represented using multiple views. Apart from a textual view cued by both semantic and syntactic information, the framework also uses a complementary view extracted from the images contained in the web snippets. A multi-objective clustering technique consults these two individual views to obtain a single consensus partitioning. Several objective functions are optimized simultaneously using AMOSA (archived multi-objective simulated annealing): the values of a cluster quality measure, which evaluates the goodness of the partitionings obtained using the different views, and an agreement-disagreement index, which quantifies the degree of agreement among the views in generating partitionings. To detect the number of clusters automatically, variable-length solutions and a wide range of permutation operators are introduced into the clustering process. The proposed multi-view multi-objective technique finally yields a set of alternative partitionings on the final Pareto front. Experimental results on several benchmark datasets, with respect to different performance metrics, establish the power of the visual and text-based views in achieving better search result clustering.
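
The two ideas in the abstract that are easiest to illustrate are the agreement between view-specific partitionings and the selection of non-dominated solutions. The sketch below is only illustrative and is not the authors' implementation: the abstract does not define the agreement-disagreement index, so a Rand-style pairwise agreement is assumed in its place; the objective values are made up for the example; and the names `pair_agreement`, `dominates`, and `pareto_front` are hypothetical. It does not implement AMOSA itself, only the dominance test that any Pareto-based optimizer relies on.

```python
from itertools import combinations


def pair_agreement(labels_a, labels_b):
    """Fraction of point pairs that both partitions treat alike.

    Assumption: a Rand-style stand-in for the paper's
    agreement-disagreement index, which the abstract does not define.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def dominates(u, v):
    """True if u is no worse than v on every objective and better on one.

    All objectives are assumed to be maximized.
    """
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_front(scored):
    """Keep the (name, objectives) entries that no other entry dominates."""
    return [
        (name, obj)
        for name, obj in scored
        if not any(dominates(other, obj) for _, other in scored)
    ]


# Toy cluster labels for six snippets under each view (made-up data).
text_view = [0, 0, 1, 1, 2, 2]
image_view = [0, 0, 1, 2, 2, 2]
agreement = pair_agreement(text_view, image_view)

# Hypothetical objective vectors:
# (text-view quality, image-view quality, agreement), all maximized.
candidates = [
    ("A", (0.80, 0.60, 0.90)),
    ("B", (0.70, 0.75, 0.85)),
    ("C", (0.65, 0.55, 0.80)),  # worse than A on every objective
]
front = pareto_front(candidates)
```

In this toy setting the front retains candidates A and B, since neither is better than the other on every objective, which mirrors why the method reports a set of alternative partitionings instead of a single one.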
