Towards cross-category knowledge propagation for learning visual concepts

In recent years, knowledge transfer algorithms have become one of most the active research areas in learning visual concepts. Most of the existing learning algorithms focuses on leveraging the knowledge transfer process which is specific to a given category. However, in many cases, such a process may not be very effective when a particular target category has very few samples. In such cases, it is interesting to examine, whether it is feasible to use cross-category knowledge for improving the learning process by exploring the knowledge in correlated categories. Such a task can be quite challenging due to variations in semantic similarities and differences between categories, which could either help or hinder the cross-category learning process. In order to address this challenge, we develop a cross-category label propagation algorithm, which can directly propagate the inter-category knowledge at instance level between the source and the target categories. Furthermore, this algorithm can automatically detect conditions under which the transfer process can be detrimental to the learning process. This provides us a way to know when the transfer of cross-category knowledge is both useful and desirable. We present experimental results on real image and video data sets in order to demonstrate the effectiveness of our approach.

[1]  Gang Hua,et al.  Discriminative Learning of Local Image Descriptors , 1990, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Rong Yan,et al.  Model-shared subspace boosting for multi-label classification , 2007, KDD '07.

[3]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[4]  D. Hubel,et al.  Receptive fields, binocular interaction and functional architecture in the cat's visual cortex , 1962, The Journal of physiology.

[5]  Tao Mei,et al.  Correlative multi-label video annotation , 2007, ACM Multimedia.

[6]  Trevor Darrell,et al.  Adapting Visual Category Models to New Domains , 2010, ECCV.

[7]  Aapo Hyvärinen,et al.  Natural Image Statistics - A Probabilistic Approach to Early Computational Vision , 2009, Computational Imaging and Vision.

[8]  Antonio Torralba,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .

[9]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[10]  Bernt Schiele,et al.  What helps where – and why? Semantic relatedness for knowledge transfer , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[11]  Vincent Lepetit,et al.  DAISY: An Efficient Dense Descriptor Applied to Wide-Baseline Stereo , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Geoffrey E. Hinton Learning to represent visual input , 2010, Philosophical Transactions of the Royal Society B: Biological Sciences.

[13]  Qiang Yang,et al.  Translated Learning: Transfer Learning across Different Feature Spaces , 2008, NIPS.

[14]  Geoffrey E. Hinton,et al.  Modeling pixel means and covariances using factorized third-order boltzmann machines , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[15]  Guillermo Sapiro,et al.  Non-local sparse models for image restoration , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[16]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, CVPR.

[17]  Yi Yao,et al.  Boosting for transfer learning with multiple sources , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[18]  Marc'Aurelio Ranzato,et al.  Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Thomas Serre,et al.  Robust Object Recognition with Cortex-Like Mechanisms , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Rong Yan,et al.  Cross-domain video concept detection using adaptive svms , 2007, ACM Multimedia.

[21]  Tong Zhang,et al.  Improved Local Coordinate Coding using Local Tangents , 2010, ICML.

[22]  Marc'Aurelio Ranzato,et al.  Sparse Feature Learning for Deep Belief Networks , 2007, NIPS.

[23]  Yann LeCun,et al.  What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[24]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[25]  I. Daubechies,et al.  An iterative thresholding algorithm for linear inverse problems with a sparsity constraint , 2003, math/0307152.

[26]  Qiang Yang,et al.  Boosting for transfer learning , 2007, ICML '07.

[27]  Stéphane Mallat,et al.  Matching pursuit of images , 1995, Proceedings., International Conference on Image Processing.

[28]  Nicolas Pinto,et al.  Why is Real-World Visual Object Recognition Hard? , 2008, PLoS Comput. Biol..

[29]  Barbara Caputo,et al.  Safety in numbers: Learning categories from few examples with multi model knowledge transfer , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[30]  Jean Ponce,et al.  A Theoretical Analysis of Feature Pooling in Visual Recognition , 2010, ICML.

[31]  Charu C. Aggarwal,et al.  Towards semantic knowledge propagation from text corpus to web images , 2011, WWW.

[32]  Marc'Aurelio Ranzato,et al.  Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition , 2010, ArXiv.

[33]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[34]  Honglak Lee,et al.  Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.

[35]  Guillermo Sapiro,et al.  Discriminative learned dictionaries for local image analysis , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[37]  Marc'Aurelio Ranzato,et al.  Efficient Learning of Sparse Representations with an Energy-Based Model , 2006, NIPS.

[38]  David J. Field,et al.  Sparse coding with an overcomplete basis set: A strategy employed by V1? , 1997, Vision Research.

[39]  Jitendra Malik,et al.  Representing and Recognizing the Visual Appearance of Materials using Three-dimensional Textons , 2001, International Journal of Computer Vision.

[40]  Guillermo Sapiro,et al.  Sparse Representation for Computer Vision and Pattern Recognition , 2010, Proceedings of the IEEE.

[41]  J L Gallant,et al.  Sparse coding and decorrelation in primary visual cortex during natural vision. , 2000, Science.

[42]  Thomas G. Dietterich,et al.  To transfer or not to transfer , 2005, NIPS 2005.

[43]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[44]  Jean Ponce,et al.  Learning mid-level features for recognition , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[45]  Honglak Lee,et al.  Sparse deep belief net model for visual area V2 , 2007, NIPS.

[46]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[47]  Gang Hua,et al.  Discriminant Embedding for Local Image Descriptors , 2007, 2007 IEEE 11th International Conference on Computer Vision.