Spectral attribute learning for visual regression

A number of computer vision problems, such as facial age estimation, crowd counting and pose estimation, can be solved by learning a regression mapping on low-level imagery features. We show that visual regression can be substantially improved by a two-stage regression in which imagery features are first mapped to an attribute space that explicitly models latent correlations across continuously changing outputs. We propose an approach that automatically discovers spectral attributes, which avoids the manual work required to define hand-crafted attribute representations. Visual attribute regression outperforms direct visual regression, and our spectral attribute visual regression achieves state-of-the-art accuracy in multiple applications.

Highlights

Spectral attributes avoid manually engineered attribute construction.

Spectral attributes handle multiple correlated regression outputs.

Spectral attributes achieve state-of-the-art performance on various benchmarks.
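The two-stage idea in the abstract (features to attributes, then attributes to output) can be illustrated with a minimal sketch. The abstract does not specify how the spectral attributes are computed, so everything below is an assumption made for illustration: the attributes are taken as the leading eigenvectors of a normalized Gaussian affinity matrix built over the training labels, both stages use plain ridge regression, and the function names (`spectral_attributes`, `fit_two_stage`, `predict_two_stage`) and the synthetic data are hypothetical.

```python
import numpy as np

def ridge_fit(X, Y, lam=1e-2):
    # Closed-form ridge regression with a bias column (bias is also regularized).
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    d = Xb.shape[1]
    return np.linalg.solve(Xb.T @ Xb + lam * np.eye(d), Xb.T @ Y)

def ridge_predict(X, W):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W

def spectral_attributes(y, k=5, sigma=None):
    # ASSUMPTION: attributes are the top-k eigenvectors of a normalized
    # Gaussian affinity matrix over the labels; not taken from the paper.
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    d2 = ((y[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    if sigma is None:
        sigma = np.sqrt(np.median(d2[d2 > 0]))  # median heuristic
    W = np.exp(-d2 / (2 * sigma ** 2))
    deg = W.sum(axis=1)
    S = W / np.sqrt(np.outer(deg, deg))  # D^{-1/2} W D^{-1/2}
    _, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    return vecs[:, -k:]                  # keep the k leading eigenvectors

def fit_two_stage(X, y, k=5, lam=1e-2):
    y2d = np.asarray(y, dtype=float).reshape(len(y), -1)
    A = spectral_attributes(y2d, k)      # attribute targets for training
    W1 = ridge_fit(X, A, lam)            # stage 1: features -> attributes
    W2 = ridge_fit(A, y2d, lam)          # stage 2: attributes -> outputs
    return W1, W2

def predict_two_stage(X, W1, W2):
    return ridge_predict(ridge_predict(X, W1), W2)

# Synthetic stand-in for image features and a scalar label such as age.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)

W1, W2 = fit_two_stage(X[:150], y[:150])
y_hat = predict_two_stage(X[150:], W1, W2)
print(y_hat.shape)
```

Because the affinity matrix is built from label vectors rather than scalars, the same sketch accepts multi-dimensional outputs (for example several pose angles), which is one plausible reading of how correlated outputs could share an attribute space.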
