From frequent itemsets to semantically meaningful visual patterns

Data mining techniques that are successful in transaction and text data may not be simply applied to image data that contain high-dimensional features and have spatial structures. It is not a trivial task to discover meaningful visual patterns in image databases, because the content variations and spatial dependency in the visual data greatly challenge most existing methods. This paper presents a novel approach to coping with these difficulties for mining meaningful visual patterns. Specifically, the novelty of this work lies in the following new contributions: (1) a principled solution to the discovery of meaningful itemsets based on frequent itemset mining; (2) a self-supervised clustering scheme of the high-dimensional visual features by feeding back discovered patterns to tune the similarity measure through metric learning; and (3) a pattern summarization method that deals with the measurement noises brought by the image data. The experimental results in the real images show that our method can discover semantically meaningful patterns efficiently and effectively.

[1]  Xin Zhang,et al.  Fast mining of spatial collocations , 2004, KDD.

[2]  Jiawei Han,et al.  Mining recurrent items in multimedia with progressive resolution refinement , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[3]  Luc Van Gool,et al.  Video mining with frequent itemset configurations , 2006 .

[4]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[5]  Thomas S. Huang,et al.  Spatial pattern discovery by learning a probabilistic parametric model from multiple attributed relational graphs , 2004, Discret. Appl. Math..

[6]  Mong-Li Lee,et al.  Mining viewpoint patterns in image databases , 2003, KDD '03.

[7]  Yan Ke,et al.  PCA-SIFT: a more distinctive representation for local image descriptors , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[8]  Andrew Zisserman,et al.  Video data mining using configurations of viewpoint invariant regions , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[9]  Jiawei Han,et al.  Generating semantic annotations for frequent patterns with context analysis , 2006, KDD '06.

[10]  Cordelia Schmid,et al.  A Comparison of Affine Region Detectors , 2005, International Journal of Computer Vision.

[11]  Jiawei Han,et al.  Extracting redundancy-aware top-k patterns , 2006, KDD '06.

[12]  Hui Xiong,et al.  Discovering colocation patterns from spatial data sets: a general approach , 2004, IEEE Transactions on Knowledge and Data Engineering.

[13]  Toon Calders,et al.  Depth-First Non-Derivable Itemset Mining , 2005, SDM.

[14]  Jiawei Han,et al.  Summarizing itemset patterns: a profile-based approach , 2005, KDD '05.

[15]  Hung-Khoon Tan,et al.  Common pattern discovery using earth mover's distance and local flow maximization , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[16]  B. S. Manjunath,et al.  Mining Image Datasets Using Perceptual Association Rules , 2003 .

[17]  Tomasz Imielinski,et al.  Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[18]  Raymond J. Mooney,et al.  A probabilistic framework for semi-supervised clustering , 2004, KDD.

[19]  Gösta Grahne,et al.  Fast algorithms for frequent itemset mining using FP-trees , 2005, IEEE Transactions on Knowledge and Data Engineering.

[20]  Cheng Yang,et al.  Efficient discovery of error-tolerant frequent itemsets in high dimensions , 2001, KDD '01.

[21]  Andrew B. Nobel,et al.  Mining Approximate Frequent Itemsets In the Presence of Noise: Algorithm and Analysis , 2006, SDM.

[22]  Jiawei Han,et al.  Frequent pattern mining: current status and future directions , 2007, Data Mining and Knowledge Discovery.

[23]  Petra Perner,et al.  Data Mining - Concepts and Techniques , 2002, Künstliche Intell..

[24]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[25]  Ming Yang,et al.  Discovery of Collocation Patterns: from Visual Words to Visual Phrases , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Jian Pei,et al.  Mining frequent patterns without candidate generation , 2000, SIGMOD '00.

[27]  Geoffrey E. Hinton,et al.  Neighbourhood Components Analysis , 2004, NIPS.

[28]  Nicolas Pasquier,et al.  Discovering Frequent Closed Itemsets for Association Rules , 1999, ICDT.

[29]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[30]  Jilles Vreeken,et al.  Item Sets that Compress , 2006, SDM.

[31]  Srinivasan Parthasarathy,et al.  Summarizing itemset patterns using probabilistic models , 2006, KDD '06.

[32]  Aristides Gionis,et al.  Approximating a collection of frequent sets , 2004, KDD.