Nonlinear Regression via Deep Negative Correlation Learning

Nonlinear regression is widely used in computer vision problems such as crowd counting, age estimation, and affective computing. Within deep learning, two common solutions exist: i) recasting nonlinear regression as a robust loss function that is jointly optimizable with the deep convolutional network, and ii) using an ensemble of deep networks. Although both improve performance, the former may fall short because of the intrinsic limitation of choosing a single hypothesis, and the latter may incur much higher computational complexity. To cope with these issues, we propose to regress in an efficient “divide and conquer” manner. The core of our approach is a generalization of negative correlation learning, which has been shown, both theoretically and empirically, to work well for non-deep regression problems. Without extra parameters, the proposed method controls the bias-variance-covariance trade-off systematically and usually yields a deep regression ensemble in which each base model is both “accurate” and “diversified.” Moreover, we show that each sub-problem in the proposed method has lower Rademacher complexity and is thus easier to optimize. Extensive experiments on several diverse and challenging tasks, including crowd counting, personality analysis, age estimation, and image super-resolution, demonstrate the superiority of the proposed method over challenging baselines as well as its versatility. The source code and trained models are available on our project page: https://mmcheng.net/dncl/.
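
To make the "accurate" and "diversified" trade-off concrete, the following is a minimal sketch of the classical (non-deep) negative correlation learning loss for a single sample, together with the ambiguity decomposition that motivates it. It is an illustration under stated assumptions, not the paper's released implementation: the function names `ncl_losses` and `ambiguity_decomposition`, the penalty weight `lam`, and the toy predictions are all hypothetical.

```python
from statistics import fmean


def ncl_losses(preds, target, lam=0.5):
    """Per-member losses under the classical NCL formulation (assumed here).

    preds  : list of scalar predictions, one per ensemble member
    target : scalar ground-truth value
    lam    : hypothetical weight of the correlation penalty
    """
    mean_pred = fmean(preds)
    losses = []
    for f_k in preds:
        accuracy_term = 0.5 * (f_k - target) ** 2
        # Deviations from the ensemble mean sum to zero, so
        # (f_k - mean) * sum_{j != k} (f_j - mean) equals -(f_k - mean)^2.
        correlation_penalty = -((f_k - mean_pred) ** 2)
        losses.append(accuracy_term + lam * correlation_penalty)
    return losses


def ambiguity_decomposition(preds, target):
    """Return (ensemble error, average member error, average ambiguity)."""
    mean_pred = fmean(preds)
    ensemble_error = (mean_pred - target) ** 2
    avg_member_error = fmean([(f - target) ** 2 for f in preds])
    avg_ambiguity = fmean([(f - mean_pred) ** 2 for f in preds])
    return ensemble_error, avg_member_error, avg_ambiguity


# Toy example: three members predicting a target of 5.0.
toy_preds = [2.0, 4.0, 6.0]
ens_err, avg_err, avg_amb = ambiguity_decomposition(toy_preds, 5.0)
# The ensemble error equals the average member error minus the average ambiguity.
print(ens_err, avg_err - avg_amb)
print(ncl_losses(toy_preds, 5.0, lam=0.5))
```

In this sketch, setting `lam` to zero reduces each member's loss to its own squared error, while a larger `lam` rewards members for spreading around the ensemble mean. That is one way to read the bias-variance-covariance trade-off mentioned above.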
