Paper information - Visual Representations and Models: From Latent SVM to Deep Learning

Two important components of a visual recognition system are representation and model. Both involve the selection and learning of the features that are indicative for recognition and discarding tho ...

Hossein Azizpour

[1] Alan L. Yuille,et al. The Concave-Convex Procedure , 2003, Neural Computation.

[2] Atsuto Maki,et al. A Baseline for Visual Instance Retrieval with Deep Convolutional Networks , 2014, ICLR 2015.

[3] Cordelia Schmid,et al. Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[4] C. V. Jawahar,et al. Blocks That Shout: Distinctive Parts for Scene Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[5] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[6] Fei-Fei Li,et al. Combining randomization and discrimination for fine-grained image categorization , 2011, CVPR 2011.

[7] Andrew Zisserman,et al. Discriminative Sub-categorization , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[8] Bernt Schiele,et al. A database for fine grained activity detection of cooking activities , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[9] Massimiliano Pontil,et al. Multi-Task Feature Learning , 2006, NIPS.

[10] Antonio Torralba,et al. Recognizing indoor scenes , 2009, CVPR.

[11] Michael I. Jordan,et al. On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[12] Silvio Savarese,et al. A multi-view probabilistic model for 3D object classes , 2009, CVPR.

[13] Subhransu Maji,et al. Describing people: A poselet-based approach to attribute classification , 2011, 2011 International Conference on Computer Vision.

[14] Alexei A. Efros,et al. Mid-level Visual Element Discovery as Discriminative Mode Seeking , 2013, NIPS.

[15] Cordelia Schmid,et al. Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[16] Andrew Zisserman,et al. BiCoS: A Bi-level co-segmentation method for image classification , 2011, 2011 International Conference on Computer Vision.

[17] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[18] Stefan Carlsson,et al. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[19] Hao Su,et al. Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification , 2010, NIPS.

[20] Derek Hoiem,et al. Diagnosing Error in Object Detectors , 2012, ECCV.

[21] Forrest N. Iandola,et al. Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction , 2013, 2013 IEEE International Conference on Computer Vision.

[22] Andrew Zisserman,et al. Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.

[23] David A. McAllester,et al. Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24] James Hays,et al. SUN attribute database: Discovering, annotating, and recognizing scene attributes , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[25] Trevor Darrell,et al. Dynamic visual category learning , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[26] Krista A. Ehinger,et al. SUN database: Large-scale scene recognition from abbey to zoo , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[27] Leonidas J. Guibas,et al. Human action recognition by learning bases of action attributes and parts , 2011, 2011 International Conference on Computer Vision.

[28] Venkatesh Saligrama,et al. Local Supervised Learning through Space Partitioning , 2012, NIPS.

[29] Yuan Li,et al. Vector boosting for rotation invariant multi-view face detection , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[30] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[31] Yi Yang,et al. Articulated pose estimation with flexible mixtures-of-parts , 2011, CVPR 2011.

[32] Ivan Laptev,et al. Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[33] Yunde Jia,et al. Discriminatively Trained And-Or Tree Models for Object Detection , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[34] Shenghuo Zhu,et al. Efficient Object Detection and Segmentation for Fine-Grained Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[35] Christian Szegedy,et al. DeepPose: Human Pose Estimation via Deep Neural Networks , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[36] Olac Fuentes,et al. Knowledge Transfer in Deep convolutional Neural Nets , 2007, Int. J. Artif. Intell. Tools.

[37] Jian Dong,et al. Subcategory-Aware Object Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[38] Tinne Tuytelaars,et al. Mining Mid-level Features for Image Classification , 2014, International Journal of Computer Vision.

[39] Patrick Gros,et al. Asymmetric hamming embedding: taking the best of our bits for large scale image search , 2011, ACM Multimedia.

[40] Jordi Gonzàlez,et al. A coarse-to-fine approach for fast deformable object detection , 2011, CVPR 2011.

[41] Pedro F. Felzenszwalb,et al. Reconfigurable models for scene recognition , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[42] Daphne Koller,et al. Self-Paced Learning for Latent Variable Models , 2010, NIPS.

[43] Bill Triggs,et al. Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[44] Yong Jae Lee,et al. AverageExplorer: interactive exploration and alignment of visual data collections , 2014, ACM Trans. Graph..

[45] Qiang Chen,et al. Contextualizing Object Detection and Classification , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[46] Andrew Zisserman,et al. Smooth object retrieval using a bag of boundaries , 2011, 2011 International Conference on Computer Vision.

[47] Cordelia Schmid,et al. Aggregating Local Image Descriptors into Compact Codes , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[48] Trevor Darrell,et al. Pose pooling kernels for sub-category recognition , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[49] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[50] Yang Wang,et al. Kernel Latent SVM for Visual Recognition , 2012, NIPS.

[51] Michael Isard,et al. Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[52] Pedro F. Felzenszwalb. Object detection grammars , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[53] Michael Isard,et al. Lost in quantization: Improving particular object retrieval in large scale image databases , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[54] Peter N. Belhumeur,et al. POOF: Part-Based One-vs.-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[55] Pietro Perona,et al. Pedestrian detection: A benchmark , 2009, CVPR.

[56] Kunihiko Fukushima,et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.

[57] Stefan Carlsson,et al. Mixture Component Identification and Learning for Visual Recognition , 2012, ECCV.

[58] Andreas Krause,et al. Discriminative Clustering by Regularized Information Maximization , 2010, NIPS.

[59] Andrew Zisserman,et al. Automatic Discovery and Optimization of Parts for Image Classification , 2015, ICLR.

[60] Lorien Y. Pratt,et al. Discriminability-Based Transfer between Neural Networks , 1992, NIPS.

[61] Jiri Matas,et al. Learning a Fine Vocabulary , 2010, ECCV.

[62] Ali Farhadi,et al. Describing objects by their attributes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[63] Andreas E. Savakis,et al. Sparse Representations and Distance Learning for Attribute Based Category Recognition , 2010, ECCV Workshops.

[64] Ali Farhadi,et al. Recognition using visual phrases , 2011, CVPR 2011.

[65] Derek Hoiem,et al. Learning Collections of Part Models for Object Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[66] Svetlana Lazebnik,et al. Multi-scale Orderless Pooling of Deep Convolutional Activation Features , 2014, ECCV.

[67] Hervé Jégou,et al. Negative Evidences and Co-occurences in Image Retrieval: The Benefit of PCA and Whitening , 2012, ECCV.

[68] Bolei Zhou,et al. Learning Deep Features for Scene Recognition using Places Database , 2014, NIPS.

[69] Alexei A. Efros,et al. What makes Paris look like Paris? , 2015, Commun. ACM.

[70] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[71] Alice J. O'Toole,et al. Face Recognition Algorithms Surpass Humans Matching Faces Over Changes in Illumination , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[72] Ming Yang,et al. DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[73] Zhuowen Tu,et al. Harvesting Mid-level Visual Concepts from Large-Scale Internet Images , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[74] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[75] Trevor Darrell,et al. PANDA: Pose Aligned Networks for Deep Attribute Modeling , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[76] Cewu Lu,et al. Learning Important Spatial Pooling Regions for Scene Classification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[77] Matthew J. Hausknecht,et al. Beyond short snippets: Deep networks for video classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[78] Pietro Perona,et al. Caltech-UCSD Birds 200 , 2010.

[79] Long Zhu,et al. Active Mask Hierarchies for Object Detection , 2010, ECCV.

[80] Larry S. Davis,et al. Incremental Multiple Kernel Learning for object recognition , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[81] Andrew Zisserman,et al. Multiple kernels for object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[82] Florent Perronnin,et al. Large-scale image retrieval with compressed Fisher vectors , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[83] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[84] Zaïd Harchaoui,et al. DIFFRAC: a discriminative and flexible framework for clustering , 2007, NIPS.

[85] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[86] Krista A. Ehinger,et al. SUN Database: Exploring a Large Collection of Scene Categories , 2014, International Journal of Computer Vision.

[87] Jonathan Krause,et al. Fine-grained recognition without part annotations , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[88] Andrew Zisserman,et al. Automated Flower Classification over a Large Number of Classes , 2008, 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing.

[89] Thorsten Joachims,et al. Learning structural SVMs with latent variables , 2009, ICML '09.

[90] Guillaume Gravier,et al. Oriented pooling for dense and non-dense rotation-invariant features , 2013, BMVC.

[91] C. V. Jawahar,et al. Cats and dogs , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[92] Trevor Darrell,et al. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[93] Jitendra Malik,et al. Analyzing the Performance of Multilayer Neural Networks for Object Recognition , 2014, ECCV.

[94] Stefan Carlsson,et al. Self-tuned Visual Subclass Learning with Shared Samples An Incremental Approach , 2014, ArXiv.

[95] Ivan Laptev,et al. Object Detection Using Strongly-Supervised Deformable Part Models , 2012, ECCV.

[96] Yannis Avrithis,et al. To Aggregate or Not to aggregate: Selective Match Kernels for Image Search , 2013, 2013 IEEE International Conference on Computer Vision.

[97] Jitendra Malik,et al. Poselets: Body part detectors trained using 3D human pose annotations , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[98] Pietro Perona,et al. Incremental learning of nonparametric Bayesian mixture models , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[99] Pietro Perona,et al. The Caltech-UCSD Birds-200-2011 Dataset , 2011.

[100] Yi Yang,et al. Articulated Human Detection with Flexible Mixtures of Parts , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[101] Cordelia Schmid,et al. Viewpoint-independent object class detection using 3D Feature Maps , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[102] Ross B. Girshick,et al. Fast R-CNN , 2015, arXiv:1504.08083.

[103] Kathrin Klamroth,et al. Biconvex sets and optimization with biconvex functions: a survey and extensions , 2007, Math. Methods Oper. Res..

[104] Jianguo Zhang,et al. The PASCAL Visual Object Classes Challenge , 2006.

[105] Yoshua Bengio,et al. How transferable are features in deep neural networks? , 2014, NIPS.

[106] Qiang Chen,et al. Hierarchical matching with side information for image classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[107] David A. McAllester,et al. A discriminatively trained, multiscale, deformable part model , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[108] Alexei A. Efros,et al. Ensemble of exemplar-SVMs for object detection and beyond , 2011, 2011 International Conference on Computer Vision.

[109] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[110] Yoshua Bengio,et al. Practical Recommendations for Gradient-Based Training of Deep Architectures , 2012, Neural Networks: Tricks of the Trade.

[111] Christoph H. Lampert,et al. Attribute-Based Classification for Zero-Shot Visual Object Categorization , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[112] Arnold W. M. Smeulders,et al. Fine-Grained Categorization by Alignments , 2013, 2013 IEEE International Conference on Computer Vision.

[113] Jean Ponce,et al. Learning Discriminative Part Detectors for Image Classification and Cosegmentation , 2013, 2013 IEEE International Conference on Computer Vision.

[114] Roberto Cipolla,et al. MCBoost: Multiple Classifier Boosting for Perceptual Co-clustering of Images and Visual Features , 2008, NIPS.

[115] Christopher M. Brown. Inherent Bias and Noise in the Hough Transform , 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[116] Daphna Weinshall,et al. Subordinate class recognition using relational object models , 2006, NIPS.

[117] David Nistér,et al. Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[118] Luc Van Gool,et al. Latent Hough Transform for Object Detection , 2012, ECCV.

[119] Jitendra Malik,et al. Multi-component Models for Object Detection , 2012, ECCV.

[120] Luc Van Gool,et al. The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[121] Xiaofeng Ren,et al. Discriminative Mixture-of-Templates for Viewpoint Classification , 2010, ECCV.

[122] Charless C. Fowlkes,et al. Do We Need More Training Data or Better Models for Object Detection? , 2012, BMVC.

[123] Jitendra Malik,et al. Deformable part models are convolutional neural networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[124] Jason Weston,et al. Trading convexity for scalability , 2006, ICML.

[125] Trevor Darrell,et al. Part-Based R-CNNs for Fine-Grained Category Detection , 2014, ECCV.

[126] Samy Bengio,et al. A Parallel Mixture of SVMs for Very Large Scale Problems , 2001, Neural Computation.

[127] Andrew Zisserman,et al. Efficient Additive Kernels via Explicit Feature Maps , 2012, IEEE Trans. Pattern Anal. Mach. Intell..

[128] K. Mikolajczyk,et al. Higher-order Occurrence Pooling on Mid- and Low-level Features: Visual Concept Detection , 2013.

[129] Svetlana Lazebnik,et al. Scene recognition and weakly supervised object localization with deformable part-based models , 2011, 2011 International Conference on Computer Vision.

[130] Andrea Torsello,et al. Beyond partitions: Allowing overlapping groups in pairwise clustering , 2008, 2008 19th International Conference on Pattern Recognition.

[131] Yang Wang,et al. A Discriminative Latent Model of Object Classes and Attributes , 2010, ECCV.

[132] Bernt Schiele,et al. Robust Object Detection with Interleaved Categorization and Segmentation , 2008, International Journal of Computer Vision.

[133] Xiang Zhang,et al. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks , 2013, ICLR.

[134] Trevor Darrell,et al. The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.