Learning Inference Models for Computer Vision

Computer vision can be understood as the ability to perform inference on image data. Breakthroughs in computer vision technology are often marked by advances in inference techniques. This thesis proposes novel inference schemes and demonstrates applications in computer vision. We propose inference techniques for both generative and discriminative vision models. The use of generative models in vision is often hampered by the difficulty of posterior inference. We propose techniques for improving inference in MCMC sampling and message-passing inference. Our inference strategy is to learn separate discriminative models that assist Bayesian inference in a generative model. Experiments on a range of generative models show that the proposed techniques accelerate the inference process and/or converge to better solutions. A main complication in the design of discriminative models is the inclusion of prior knowledge. We concentrate on CNN models and propose a generalization of standard spatial convolutions to bilateral convolutions. We generalize the existing use of bilateral filters and then propose new neural network architectures with learnable bilateral filters, which we call `Bilateral Neural Networks'. Experiments demonstrate the use of the bilateral networks on a wide range of image and video tasks and datasets. In summary, we propose techniques for better inference in several vision models ranging from inverse graphics to freely parameterized neural networks. In generative models, our inference techniques alleviate some of the crucial hurdles in Bayesian posterior inference, paving new ways for the use of model based machine learning in vision. In discriminative CNN models, the proposed filter generalizations aid in the design of new neural network architectures that can handle sparse high-dimensional data as well as provide a way to incorporate prior knowledge into CNNs.

[1]  M. Hebert,et al.  Efficient temporal consistency for streaming video scene analysis , 2013, 2013 IEEE International Conference on Robotics and Automation.

[2]  Stefan Harmeling,et al.  Image denoising: Can plain neural networks compete with BM3D? , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Ben Graham,et al.  Sparse 3D convolutional neural networks , 2015, BMVC.

[4]  Geoffrey E. Hinton,et al.  The "wake-sleep" algorithm for unsupervised neural networks. , 1995, Science.

[5]  Peyman Milanfar,et al.  Kernel Regression for Image Processing and Reconstruction , 2007, IEEE Transactions on Image Processing.

[6]  Wei-Yin Loh,et al.  Classification and regression trees , 2011, WIREs Data Mining Knowl. Discov..

[7]  Bin Yu,et al.  Regeneration in Markov chain samplers , 1995 .

[8]  H. Zou,et al.  Regularization and variable selection via the elastic net , 2005 .

[9]  Harry Shum,et al.  Video object cut and paste , 2005, ACM Trans. Graph..

[10]  C. Lawrence Zitnick,et al.  Fast Edge Detection Using Structured Forests , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Pushmeet Kohli,et al.  Spatial Inference Machines , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  A. W. Rosenbluth,et al.  MONTE CARLO CALCULATION OF THE AVERAGE EXTENSION OF MOLECULAR CHAINS , 1955 .

[13]  D. Teets,et al.  The Discovery of Ceres: How Gauss Became Famous , 1999 .

[14]  Sylvain Paris,et al.  Edge-Preserving Smoothing and Mean-Shift Segmentation of Video Streams , 2008, ECCV.

[15]  Peter V. Gehler,et al.  Human Pose Estimation with Fields of Parts , 2014, ECCV.

[16]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[17]  Michael J. Black,et al.  Optical Flow with Semantic Segmentation and Localized Layers , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  K. Hukushima,et al.  Exchange Monte Carlo Method and Application to Spin Glass Simulations , 1995, cond-mat/9512035.

[19]  Gang Hua,et al.  Face Relighting from a Single Image under Arbitrary Unknown Lighting Conditions , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Rama Chellappa,et al.  Pose-robust albedo estimation from a single image , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[21]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[22]  David Mumford,et al.  Occlusion Models for Natural Images: A Statistical Study of a Scale-Invariant Dead Leaves Model , 2004, International Journal of Computer Vision.

[23]  Guillaume Bouchard,et al.  The Tradeoff Between Generative and Discriminative Classifiers , 2004 .

[24]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[25]  Daniel Munoz,et al.  Inference Machines: Parsing Scenes via Iterated Predictions , 2013 .

[26]  Jonathan Tompson,et al.  Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation , 2014, NIPS.

[27]  Alex Pentland,et al.  A Bayesian Computer Vision System for Modeling Human Interactions , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[28]  Vladlen Koltun,et al.  Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials , 2011, NIPS.

[29]  Vladlen Koltun,et al.  Feature Space Optimization for Semantic Video Segmentation , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Jonathan T. Barron,et al.  Fast bilateral-space stereo for synthetic defocus , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[33]  Geoffrey E. Hinton Mapping Part-Whole Hierarchies into Connectionist Networks , 1990, Artif. Intell..

[34]  C. Geyer Markov Chain Monte Carlo Maximum Likelihood , 1991 .

[35]  Dani Lischinski,et al.  JumpCut , 2015, ACM Trans. Graph..

[36]  Joost van de Weijer,et al.  Harmony potentials for joint classification and segmentation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[37]  Andrew W. Fitzgibbon,et al.  Real-time human pose recognition in parts from single depth images , 2011, CVPR 2011.

[38]  Vladlen Koltun,et al.  Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.

[39]  James J. Little,et al.  Play and Learn: Using Video Games to Train Computer Vision Models , 2016, BMVC.

[40]  Brian Taylor,et al.  Causal video object segmentation from persistence of occlusions , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  David J. Kriegman,et al.  Acquiring linear subspaces for face recognition under variable lighting , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[42]  A. Dawid The Well-Calibrated Bayesian , 1982 .

[43]  Fatih Murat Porikli,et al.  Saliency-aware geodesic video object segmentation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Yann LeCun,et al.  The mnist database of handwritten digits , 2005 .

[45]  Gerard de Haan,et al.  Trained Bilateral Filters and Applications to Coding Artifacts Reduction , 2007, 2007 IEEE International Conference on Image Processing.

[46]  D. Mumford,et al.  Pattern Theory: The Stochastic Analysis of Real-World Signals , 2010 .

[47]  Bodo Rosenhahn,et al.  Interactive Segmentation of High-Resolution Video Content Using Temporally Coherent Superpixels and Graph Cut , 2014, ISVC.

[48]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[49]  Peter V. Gehler,et al.  Video Propagation Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[50]  Dani Lischinski,et al.  Colorization using optimization , 2004, ACM Trans. Graph..

[51]  Philip H. S. Torr,et al.  Combining Appearance and Structure from Motion Features for Road Scene Understanding , 2009, BMVC.

[52]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[53]  Luc Van Gool,et al.  On-line semantic perception using uncertainty , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[54]  James M. Rehg,et al.  Video Segmentation by Tracking Many Figure-Ground Segments , 2013, 2013 IEEE International Conference on Computer Vision.

[55]  Harry Shum,et al.  Image segmentation by data driven Markov chain Monte Carlo , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[56]  Dani Lischinski,et al.  Joint bilateral upsampling , 2007, ACM Trans. Graph..

[57]  Daniel Cohen-Or,et al.  Bilateral mesh denoising , 2003 .

[58]  Noah Snavely,et al.  Material recognition in the wild with the Materials in Context Database , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[59]  R. Venkatesh Babu,et al.  SeamSeg: Video Object Segmentation Using Patch Seams , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[60]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[61]  Michael J. Black,et al.  Video Segmentation via Object Flow , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[62]  Sang Uk Lee,et al.  Image and video colorization based on prioritized source propagation , 2009, 2009 16th IEEE International Conference on Image Processing (ICIP).

[63]  Justin Domke,et al.  Parameter learning with truncated message-passing , 2011, CVPR 2011.

[64]  David Salesin,et al.  Video matting of complex scenes , 2002, SIGGRAPH.

[65]  Jörg Weule,et al.  Non-Linear Gaussian Filters Performing Edge Preserving Diffusion , 1995, DAGM-Symposium.

[66]  Antonio Criminisi,et al.  Decision Forests for Computer Vision and Medical Image Analysis , 2013, Advances in Computer Vision and Pattern Recognition.

[67]  David J. Fleet,et al.  Model-based hand tracking with texture, shading and self-occlusions , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[68]  Jonathan T. Barron,et al.  The Fast Bilateral Solver , 2015, ECCV.

[69]  Geoffrey E. Hinton,et al.  Attend, Infer, Repeat: Fast Scene Understanding with Generative Models , 2016, NIPS.

[70]  Bradley P. Carlin,et al.  Markov Chain Monte Carlo in Practice: A Roundtable Discussion , 1998 .

[71]  Qiang Chen,et al.  Network In Network , 2013, ICLR.

[72]  Graham W. Taylor,et al.  Deconvolutional networks , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[73]  Peter V. Gehler,et al.  Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[74]  Michal Irani,et al.  Video Segmentation by Non-Local Consensus voting , 2014, BMVC.

[75]  Luc Van Gool,et al.  A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[76]  Daniel Fried,et al.  Bayesian geometric modeling of indoor scenes , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[77]  Andrew Zisserman,et al.  Spatial Transformer Networks , 2015, NIPS.

[78]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[79]  Matthew D. Zeiler ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[80]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[81]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[82]  C. Lawrence Zitnick,et al.  Structured Forests for Fast Edge Detection , 2013, 2013 IEEE International Conference on Computer Vision.

[83]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[84]  R. Schapire The Strength of Weak Learnability , 1990, Machine Learning.

[85]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[86]  Maneesh Agrawala,et al.  Interactive video cutout , 2005, ACM Trans. Graph..

[87]  Peter V. Gehler,et al.  Superpixel Convolutional Networks Using Bilateral Inceptions , 2015, ECCV.

[88]  Ian D. Reid,et al.  Deeply Learning the Messages in Message Passing Inference , 2015, NIPS.

[89]  Atsushi Nakazawa,et al.  Motion Coherent Tracking Using Multi-label MRF Optimization , 2012, International Journal of Computer Vision.

[90]  Yutaka Ohtake,et al.  Mesh smoothing via mean and median filtering applied to face normals , 2002, Geometric Modeling and Processing. Theory and Applications. GMP 2002. Proceedings.

[91]  D. Rubin,et al.  Inference from Iterative Simulation Using Multiple Sequences , 1992 .

[92]  Simon J. D. Prince,et al.  Computer Vision: Models, Learning, and Inference , 2012 .

[93]  Luc Van Gool,et al.  Fast Optical Flow Using Dense Inverse Search , 2016, ECCV.

[94]  Vibhav Vineet,et al.  PoseField: An Efficient Mean-Field Based Method for Joint Estimation of Human Pose, Segmentation, and Depth , 2013, EMMCVPR.

[95]  Peter V. Gehler,et al.  Efficient 2D and 3D Facade Segmentation Using Auto-Context , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[96]  Max Welling,et al.  Distributed and Adaptive Darting Monte Carlo through Regenerations , 2013, AISTATS.

[97]  Ling Shao,et al.  Enhanced Computer Vision With Microsoft Kinect Sensor: A Review , 2013, IEEE Transactions on Cybernetics.

[98]  Rob Fergus,et al.  Depth Map Prediction from a Single Image using a Multi-Scale Deep Network , 2014, NIPS.

[99]  Pascal Fua,et al.  SLIC Superpixels Compared to State-of-the-Art Superpixel Methods , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[100]  Frédo Durand,et al.  A Fast Approximation of the Bilateral Filter Using a Signal Processing Approach , 2006, International Journal of Computer Vision.

[101]  William T. Freeman,et al.  Nonparametric belief propagation , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[102]  Vittorio Ferrari,et al.  Fast Object Segmentation in Unconstrained Video , 2013, 2013 IEEE International Conference on Computer Vision.

[103]  Eric R. Ziegel,et al.  The Elements of Statistical Learning , 2003, Technometrics.

[104]  G. Roberts,et al.  Adaptive Markov Chain Monte Carlo through Regeneration , 1998 .

[105]  Kurt Hornik,et al.  Approximation capabilities of multilayer feedforward networks , 1991, Neural Networks.

[106]  Bruce G. Baumgart,et al.  Geometric modeling for computer vision. , 1974 .

[107]  Ling Shao,et al.  Computer vision for RGB-D sensors: Kinect and its applications [special issue intro.] , 2013, IEEE Transactions on Cybernetics.

[108]  Varun Ramakrishna,et al.  Pose Machines: Articulated Pose Estimation via Inference Machines , 2014, ECCV.

[109]  Tim Hesterberg,et al.  Monte Carlo Strategies in Scientific Computing , 2002, Technometrics.

[110]  Jiayan Jiang,et al.  Efficient scale space auto-context for image segmentation and labeling , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[111]  Andrew Gelman,et al.  Handbook of Markov Chain Monte Carlo , 2011 .

[112]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[113]  Vibhav Vineet,et al.  Filter-Based Mean-Field Inference for Random Fields with Higher-Order Terms and Product Label-Spaces , 2012, International Journal of Computer Vision.

[114]  Guillermo Sapiro,et al.  Video SnapCut: robust video object cutout using localized classifiers , 2009, SIGGRAPH 2009.

[115]  Stephen M. Smith,et al.  SUSAN—A New Approach to Low Level Image Processing , 1997, International Journal of Computer Vision.

[116]  Luc Van Gool,et al.  Segmentation-Based Urban Traffic Scene Understanding , 2009, BMVC.

[117]  Raquel Urtasun,et al.  Fully Connected Deep Structured Networks , 2015, ArXiv.

[118]  Tom Minka,et al.  Principled Hybrids of Generative and Discriminative Models , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[119]  Vincent Lepetit,et al.  A fast local descriptor for dense matching , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[120]  Antonio M. López,et al.  The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[121]  Lei Zhang,et al.  Face recognition from a single training image under arbitrary unknown lighting using spherical harmonics , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[122]  Leo Breiman,et al.  Bagging Predictors , 1996, Machine Learning.

[123]  J. Hammersley,et al.  Poor Man's Monte Carlo , 1954 .

[124]  Manuel Menezes de Oliveira Neto,et al.  Domain transform for edge-aware image and video processing , 2011, ACM Trans. Graph..

[125]  Benjamin Graham,et al.  Spatially-sparse convolutional neural networks , 2014, ArXiv.

[126]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[127]  Joshua B. Tenenbaum,et al.  Approximate Bayesian Image Interpretation using Generative Probabilistic Graphics Programs , 2013, NIPS.

[128]  Vladlen Koltun,et al.  Playing for Data: Ground Truth from Computer Games , 2016, ECCV.

[129]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[130]  Jürgen Schmidhuber,et al.  Deep learning in neural networks: An overview , 2014, Neural Networks.

[131]  Matthias Bethge,et al.  Generative Image Modeling Using Spatial LSTMs , 2015, NIPS.

[132]  Kurt Keutzer,et al.  Dense Point Trajectories by GPU-Accelerated Large Displacement Optical Flow , 2010, ECCV.

[133]  Cristian Sminchisescu,et al.  Generalized Darting Monte Carlo , 2007, AISTATS.

[134]  Jonathan T. Barron,et al.  Semantic Image Segmentation with Task-Specific Edge Detection Using CNNs and a Discriminatively Trained Domain Transform , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[135]  Larry S. Davis,et al.  Interactive video segmentation using occlusion boundaries and temporally coherent superpixels , 2014, IEEE Winter Conference on Applications of Computer Vision.

[136]  Yu-Chiang Frank Wang,et al.  Propagated image filtering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[137]  Jian Sun,et al.  Guided Image Filtering , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[138]  Richard S. Zemel,et al.  Mean-Field Networks , 2014, ArXiv.

[139]  David A. McAllester,et al.  Particle Belief Propagation , 2009, AISTATS.

[140]  Guosheng Lin,et al.  Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[141]  Gregory Shakhnarovich,et al.  Feedforward semantic segmentation with zoom-out features , 2014, CVPR.

[142]  Mubarak Shah,et al.  Video Object Segmentation through Spatially Accurate and Temporally Dense Extraction of Primary Object Regions , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[143]  Andrew Adams,et al.  Fast High‐Dimensional Filtering Using the Permutohedral Lattice , 2010, Comput. Graph. Forum.

[144]  Cristian Sminchisescu,et al.  Matrix Backpropagation for Deep Networks with Structured Layers , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[145]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[146]  David Cahan,et al.  Hermann von Helmholtz and the Foundations of Nineteenth-Century Science , 1993 .

[147]  Vasant Honavar,et al.  Discriminatively trained Markov model for sequence classification , 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[148]  Justin Domke,et al.  Learning Graphical Model Parameters with Approximate Marginal Inference , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[149]  Ulf Grenander Pattern Synthesis: Lectures in Pattern Theory , 1976 .

[150]  Jian Sun,et al.  Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[151]  Yong Jae Lee,et al.  Key-segments for video object segmentation , 2011, 2011 International Conference on Computer Vision.

[152]  T. Minka Discriminative models, not discriminative training , 2005 .

[153]  Subhransu Maji,et al.  Semantic contours from inverse detectors , 2011, 2011 International Conference on Computer Vision.

[154]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[155]  Ira Kemelmacher-Shlizerman,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 3d Face Reconstruction from a Single Image Using a Single Reference Face Shape , 2022 .

[156]  Alex Zelinsky,et al.  Learning OpenCV---Computer Vision with the OpenCV Library (Bradski, G.R. et al.; 2008)[On the Shelf] , 2009, IEEE Robotics & Automation Magazine.

[157]  Kristen Grauman,et al.  Supervoxel-Consistent Foreground Propagation in Video , 2014, ECCV.

[158]  David H. Wolpert,et al.  Stacked generalization , 1992, Neural Networks.

[159]  Antonio Criminisi,et al.  TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-class Object Recognition and Segmentation , 2006, ECCV.

[160]  Pushmeet Kohli,et al.  Just-In-Time Learning for Fast and Flexible Inference , 2014, NIPS.

[161]  Andrew Gelman,et al.  General methods for monitoring convergence of iterative simulations , 1998 .

[162]  Ruslan Salakhutdinov,et al.  Annealing between distributions by averaging moments , 2013, NIPS.

[163]  Danny Barash,et al.  A Fundamental Relationship between Bilateral Filtering, Adaptive Smoothing, and the Nonlinear Diffusion Equation , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[164]  N. Metropolis,et al.  Equation of State Calculations by Fast Computing Machines , 1953, Resonance.

[165]  Suyash P. Awate,et al.  Higher-order image statistics for unsupervised, information-theoretic, adaptive, image filtering , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[166]  Marie-Pierre Jolly,et al.  Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[167]  Roberto Cipolla,et al.  Semantic texton forests for image categorization and segmentation , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[168]  Xiaoxiao Li,et al.  Semantic Image Segmentation via Deep Parsing Network , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[169]  Daan Wierstra,et al.  Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.

[170]  Christopher Hunt,et al.  Notes on the OpenSURF Library , 2009 .

[171]  Renjie Liao,et al.  Deep Edge-Aware Filters , 2015, ICML.

[172]  Noah D. Goodman,et al.  Learning Stochastic Inverses , 2013, NIPS.

[173]  Michael J. Black,et al.  A Fully-Connected Layered Model of Foreground and Background Flow , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[174]  Rong Zhang,et al.  Integrating bottom-up/top-down for object recognition by data driven Markov chain Monte Carlo , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[175]  Zhuowen Tu,et al.  Top-Down Learning for Structured Labeling with Convolutional Pseudoprior , 2015, ECCV.

[176]  Horst Bischof,et al.  Variational Depth Superresolution Using Example-Based Edge Representations , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[177]  Wang,et al.  Replica Monte Carlo simulation of spin glasses. , 1986, Physical review letters.

[178]  E. Nummelin,et al.  A splitting technique for Harris recurrent Markov chains , 1978 .

[179]  A. Yuille,et al.  Opinion TRENDS in Cognitive Sciences Vol.10 No.7 July 2006 Special Issue: Probabilistic models of cognition Vision as Bayesian inference: analysis by synthesis? , 2022 .

[180]  Sebastian Nowozin,et al.  The informed sampler: A discriminative approach to Bayesian inference in generative computer vision models , 2014, Comput. Vis. Image Underst..

[181]  Iasonas Kokkinos,et al.  Fast, Exact and Multi-scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs , 2016, ECCV.

[182]  Sebastian Nowozin,et al.  On Parameter Learning in CRF-Based Approaches to Object Class Image Segmentation , 2010, ECCV.

[183]  Iasonas Kokkinos,et al.  Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs , 2014, ICLR.

[184]  Jean-Michel Morel,et al.  A non-local algorithm for image denoising , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[185]  Berthold K. P. Horn Understanding Image Intensities , 1977, Artif. Intell..

[186]  Jürgen Schmidhuber,et al.  Multi-column deep neural networks for image classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[187]  Brendan J. Frey,et al.  A Revolution: Belief Propagation in Graphs with Cycles , 1997, NIPS.

[188]  Markus H. Gross,et al.  Fully Connected Object Proposals for Video Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[189]  J. Rosenthal,et al.  On adaptive Markov chain Monte Carlo algorithms , 2005 .

[190]  Ian D. Reid,et al.  gSLICr: SLIC superpixels at over 250Hz , 2015, ArXiv.

[191]  Varun Jampani,et al.  Consensus Message Passing for Layered Graphical Models , 2014, AISTATS.

[192]  Derek Hoiem,et al.  Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.

[193]  Marc Levoy,et al.  Gaussian KD-trees for fast high-dimensional filtering , 2009, ACM Trans. Graph..

[194]  Manuel Menezes de Oliveira Neto,et al.  Adaptive manifolds for real-time high-dimensional filtering , 2012, ACM Trans. Graph..

[195]  Roberto Cipolla,et al.  Semantic object classes in video: A high-definition ground truth database , 2009, Pattern Recognit. Lett..

[196]  Rama Chellappa,et al.  Robust Estimation of Albedo for Illumination-invariant Matching and Shape Recovery , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[197]  Michael I. Jordan,et al.  Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn..

[198]  Michael J. Black,et al.  SMPL: A Skinned Multi-Person Linear Model , 2023 .

[199]  David Haussler,et al.  Exploiting Generative Models in Discriminative Classifiers , 1998, NIPS.

[200]  Noah Snavely,et al.  Intrinsic images in the wild , 2014, ACM Trans. Graph..

[201]  C. Bregler,et al.  Large displacement optical flow , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[202]  Sebastian Thrun,et al.  SCAPE: shape completion and animation of people , 2005, SIGGRAPH '05.

[203]  Charless C. Fowlkes,et al.  Contour Detection and Hierarchical Image Segmentation , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[204]  P. Green Reversible jump Markov chain Monte Carlo computation and Bayesian model determination , 1995 .

[205]  Peyman Milanfar,et al.  A Tour of Modern Image Filtering , 2013 .

[206]  Scott Cohen,et al.  LIVEcut: Learning-based interactive video segmentation by evaluation of multiple propagated cues , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[207]  Alan L. Yuille,et al.  Learning Deep Structured Models , 2014, ICML.

[208]  Charles M. Bishop,et al.  Variational Message Passing , 2005, J. Mach. Learn. Res..

[209]  Michael J. Black,et al.  Coregistration: Simultaneous Alignment and Modeling of Articulated 3D Shape , 2012, ECCV.

[210]  Narendra Ahuja,et al.  Deep Joint Image Filtering , 2016, ECCV.

[211]  Trevor Darrell,et al.  Clockwork Convnets for Video Semantic Segmentation , 2016, ECCV Workshops.

[212]  Xiaoou Tang,et al.  Depth Map Super-Resolution by Deep Multi-Scale Guidance , 2016, ECCV.

[213]  K. Athreya,et al.  A New Approach to the Limit Theory of Recurrent Markov Chains , 1978 .

[214]  Roberto Manduchi,et al.  Bilateral filtering for gray and color images , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[215]  Rynson W. H. Lau,et al.  SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection , 2015, International Journal of Computer Vision.

[216]  Pietro Perona,et al.  A discriminative framework for modelling object classes , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[217]  Jean-Denis Durou,et al.  Solving the Uncalibrated Photometric Stereo Problem Using Total Variation , 2013, SSVM.

[218]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[219]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[220]  Mun Wai Lee,et al.  Proposal maps driven MCMC for estimating human body pose in static images , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[221]  Zhuowen Tu,et al.  Auto-Context and Its Application to High-Level Vision Tasks and 3D Brain Image Segmentation , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[222]  Joan Bruna,et al.  Spectral Networks and Locally Connected Networks on Graphs , 2013, ICLR.

[223]  Peter V. Gehler,et al.  Permutohedral Lattice CNNs , 2015, ICLR.

[224]  Frédo Durand,et al.  Non-iterative, feature-preserving mesh smoothing , 2003, ACM Trans. Graph..

[225]  Martial Hebert,et al.  Stacked Hierarchical Labeling , 2010, ECCV.

[226]  Roberto Cipolla,et al.  Segmentation and Recognition Using Structure from Motion Point Clouds , 2008, ECCV.

[227]  Sebastian Ramos,et al.  The Cityscapes Dataset , 2015, CVPR 2015.

[228]  Jan Kautz,et al.  Fully-Connected CRFs with Non-Parametric Pairwise Potential , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[229]  Yair Weiss,et al.  Correctness of Local Probability Propagation in Graphical Models with Loops , 2000, Neural Computation.

[230]  Corinna Cortes,et al.  Support-Vector Networks , 1995, Machine Learning.

[231]  Markus Gross,et al.  Practical temporal consistency for image-based graphics applications , 2012, ACM Trans. Graph..

[232]  Murali Haran,et al.  Markov chain Monte Carlo: Can we trust the third significant figure? , 2007, math/0703746.

[233]  Jason J. Corso,et al.  Temporally consistent multi-class video-object segmentation with the Video Graph-Shifts algorithm , 2011, 2011 IEEE Workshop on Applications of Computer Vision (WACV).

[234]  Vibhav Vineet,et al.  Conditional Random Fields as Recurrent Neural Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[235]  Bernt Schiele,et al.  Learning Video Object Segmentation from Static Images , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[236]  Silvio Savarese,et al.  Structural-RNN: Deep Learning on Spatio-Temporal Graphs , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[237]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[238]  Song-Chun Zhu,et al.  Learning generic prior models for visual computation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[239]  David J. Kriegman,et al.  From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[240]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[241]  Xuming He,et al.  Superpixel Graph Label Transfer with Learned Distance Metric , 2014, ECCV.

[242]  Daniel Tarlow,et al.  Learning to Pass Expectation Propagation Messages , 2013, NIPS.

[243]  Christopher K. I. Williams,et al.  The Shape Boltzmann Machine: A Strong Model of Object Shape , 2012, International Journal of Computer Vision.

[244]  Tobias Ritschel,et al.  On-line learning of parametric mixture models for light transport simulation , 2014, ACM Trans. Graph..

[245]  Luc Van Gool,et al.  One-Shot Video Object Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[246]  David Salesin,et al.  Keyframe-based tracking for rotoscoping and animation , 2004, ACM Trans. Graph..

[247]  David J. Fleet,et al.  Robustly Estimating Changes in Image Appearance , 2000, Comput. Vis. Image Underst..

[248]  Stephen J. Roberts,et al.  A tutorial on variational Bayesian inference , 2012, Artificial Intelligence Review.

[249]  Alexander Sorkine-Hornung,et al.  Bilateral Space Video Segmentation , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[250]  Pat Hanrahan,et al.  A signal-processing framework for inverse rendering , 2001, SIGGRAPH.

[251]  Olga Veksler,et al.  Fast approximate energy minimization via graph cuts , 2001, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[252]  Thomas Brox,et al.  Learning to generate chairs with convolutional neural networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[253]  Max Welling,et al.  Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget , 2013, ICML 2014.

[254]  Pushmeet Kohli,et al.  Dynamic Graph Cuts for Efficient Inference in Markov Random Fields , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[255]  Peter V. Gehler,et al.  Efficient Facade Segmentation Using Auto-context , 2015, 2015 IEEE Winter Conference on Applications of Computer Vision.

[256]  Hanqiu Sun,et al.  Video Colorization Using Parallel Optimization in Feature Space , 2014, IEEE Transactions on Circuits and Systems for Video Technology.

[257]  Alex Graves,et al.  DRAW: A Recurrent Neural Network For Image Generation , 2015, ICML.

[258]  Dani Lischinski,et al.  A Closed-Form Solution to Natural Image Matting , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[259]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[260]  Truong Q. Nguyen,et al.  Semantic video segmentation: Exploring inference efficiency , 2015, 2015 International SoC Design Conference (ISOCC).

[261]  Geoffrey E. Hinton,et al.  Deep Lambertian Networks , 2012, ICML.

[262]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[263]  Martial Hebert,et al.  Learning message-passing inference machines for structured prediction , 2011, CVPR 2011.

[264]  William T. Freeman,et al.  Correctness of Belief Propagation in Gaussian Graphical Models of Arbitrary Topology , 1999, Neural Computation.

[265]  Brendan J. Frey,et al.  Advances in Algorithms for Inference and Learning in Complex Probability Models , 2003 .

[266]  Tom Minka,et al.  Expectation Propagation for approximate Bayesian inference , 2001, UAI.