Learning a perceptual manifold with deep features for animation video resequencing

We propose a novel deep learning framework for animation video resequencing. Our system produces new video sequences by minimizing a perceptual distance of images from an existing animation video clip. To measure perceptual distance, we utilize the activations of convolutional neural networks and learn a perceptual distance by training these features on a small network with data comprised of human perceptual judgments. We show that with this perceptual metric and graph-based manifold learning techniques, our framework can produce new smooth and visually appealing animation video results for a variety of animation video styles. In contrast to previous work on animation video resequencing, the proposed framework applies to wide range of image styles and does not require hand-crafted feature extraction, background subtraction, or feature correspondence. In addition, we also show that our framework has applications to appealing arrange unordered collections of images.

[1]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[2]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[3]  G. Laporte The traveling salesman problem: An overview of exact and approximate algorithms , 1992 .

[4]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[5]  Richard Szeliski,et al.  Video textures , 2000, SIGGRAPH.

[6]  Meng Wang,et al.  Semisupervised Multiview Distance Metric Learning for Cartoon Synthesis , 2012, IEEE Transactions on Image Processing.

[7]  Sepp Hochreiter,et al.  Self-Normalizing Neural Networks , 2017, NIPS.

[8]  M. Kendall A NEW MEASURE OF RANK CORRELATION , 1938 .

[9]  J. Kruskal On the shortest spanning subtree of a graph and the traveling salesman problem , 1956 .

[10]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[11]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[12]  Jun Yu,et al.  Interactive cartoon reusing by transfer learning , 2012, Signal Process..

[13]  Bobby Bodenheimer,et al.  Cartoon textures , 2004, SCA '04.

[14]  Jun Yu,et al.  On Combining Multiple Features for Cartoon Character Retrieval and Clip Synthesis , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[15]  Taku Komura,et al.  Learning motion manifolds with convolutional autoencoders , 2015, SIGGRAPH Asia Technical Briefs.

[16]  Irfan A. Essa,et al.  Machine Learning for Video-Based Rendering , 2000, NIPS.

[17]  Daniel Cohen-Or,et al.  Patch2Vec: Globally Consistent Image Patch Representation , 2017, Comput. Graph. Forum.

[18]  Irfan A. Essa,et al.  Controlled animation of video sprites , 2002, SCA '02.

[19]  Pascal Vincent,et al.  Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..

[20]  Klaus Schöffmann,et al.  Similarity-Based Visualization for Image Browsing Revisited , 2011, 2011 IEEE International Symposium on Multimedia.

[21]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[22]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Haibin Ling,et al.  Shape Classification Using the Inner-Distance , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Yann LeCun,et al.  Synergistic Face Detection and Pose Estimation with Energy-Based Models , 2004, J. Mach. Learn. Res..

[25]  Djemel Ziou,et al.  Image Quality Metrics: PSNR vs. SSIM , 2010, 2010 20th International Conference on Pattern Recognition.

[26]  Mikhail Belkin,et al.  Laplacian Eigenmaps for Dimensionality Reduction and Data Representation , 2003, Neural Computation.

[27]  Alexei A. Efros,et al.  The Unreasonable Effectiveness of Deep Features as a Perceptual Metric , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[28]  Xin-She Yang,et al.  Introduction to Algorithms , 2021, Nature-Inspired Optimization Algorithms.

[29]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[30]  Daniel P. Huttenlocher,et al.  Comparing Images Using the Hausdorff Distance , 1993, IEEE Trans. Pattern Anal. Mach. Intell..

[31]  E. Stacy A Generalization of the Gamma Distribution , 1962 .

[32]  Daniel Cohen-Or,et al.  Smooth Image Sequences for Data‐driven Morphing , 2016, Comput. Graph. Forum.