Paper information - Higher level techniques for the artistic rendering of images and video

Higher level techniques for the artistic rendering of images and video

ion for video in which the server (implementing the front end) determines the video content, whilst the client side (implementing the back end) determines the style in which that video is rendered. Splitting the responsibilities of video content provision and content visualisation between the client and server is a promising direction for development of our Video Paintbox architecture.

Aside from the benefits of compact representation and abstraction, also of interest is the continuous spatiotemporal nature of the Stroke Surfaces in the IR. This provides a highly manipulable vector representation of video, akin to 2D vector graphics, which enables us to synthesise animations at any scale without pixelisation. Indeed many of the figures in this Chapter were rendered at a scale factor greater than unity to produce higher resolution images than could be captured from a standard PAL video frame. Future developments might investigate the use of temporal scaling to affect the frame rate of animations.

8.9 Summary and Discussion

In this Chapter we have described a novel framework for synthesising temporally coherent non-photorealistic animations from video sequences. This framework comprises the third and final subsystem of the “Video Paintbox”, and may be combined with the previously described motion emphasis work to produce complete cartoon-style animations from video. Our rendering framework is unique among automated AR video methods in that we process video as a spatiotemporal voxel volume. Existing automated AR methods transform brush strokes independently between frames using a highly localised (per pixel, per frame) motion estimate. By contrast, in our system the decisions governing the rendering of a frame of animation are driven using information within a temporal window spanning instants before and after that frame.
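The temporal-window idea above can be illustrated with a small sketch. It is not the thesis's rendering code: the function name `smooth_over_time`, the box-filter average, and the sample colours are all assumptions made for illustration. It only shows how an attribute such as a region's colour might be stabilised using frames both before and after the current one.

```python
def smooth_over_time(colours, radius=2):
    """Average each frame's RGB colour over a window of neighbouring frames.

    `colours` is a list of (r, g, b) tuples, one per frame, for a single
    tracked region. The window spans `radius` frames before and after the
    current frame and is clipped at the ends of the sequence.
    """
    smoothed = []
    n = len(colours)
    for t in range(n):
        lo = max(0, t - radius)
        hi = min(n, t + radius + 1)
        window = colours[lo:hi]
        smoothed.append(
            tuple(sum(c[k] for c in window) / len(window) for k in range(3))
        )
    return smoothed


# Hypothetical per-frame mean colours of one region, with a one-frame flicker.
region_colours = [
    (100, 50, 50),
    (100, 50, 50),
    (160, 50, 50),  # flicker in the red channel
    (100, 50, 50),
    (100, 50, 50),
]
smoothed = smooth_over_time(region_colours, radius=1)
```

A purely per-frame method would reproduce the flicker in the third frame unchanged; averaging over the window spreads it across neighbouring frames and reduces its peak.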
This higher level of temporal analysis allows us to smoothly vary attributes such as region or stroke colour over time, and allows us to create improved motion estimates of objects in the video. Spatially, we also operate at a higher level by manipulating video as distinct regions tracked over time, rather than individual pixels. This allows us to produce robust motion estimates for objects, and facilitates the synthesis of both region based (e.g. flat-shaded cartoon) and stroke based (e.g. traditional painterly) AR styles. For the latter, brush stroke motion is guaranteed to be consistent over entire regions, so contradictory visual cues do not arise (for example, where stroke motion differs within a given object). We have shown that our high level spatiotemporal approach results in improved aesthetics and temporal coherence in resulting animations, compared to the current state of the art.

STROKE SURFACES: TEMPORALLY COHERENT A.R. ANIMATIONS FROM VIDEO

Much of the discussion of the relative merits of our approach over optical flow can be found in Section 8.6. We have demonstrated that automated rotoscoping, matting, and the extension of many “traditional” static AR styles to video, may be unified in a framework. Although we have experimented only with the extension of our own pointillist-style painterly method (Chapter 3) to video, we believe this framework to be sufficiently general to form the basis of a useful tool for the extension of further static stroke based AR techniques to video. The application of our framework to other static AR styles is perhaps the most easily exploitable direction for future work, though it does not address the limitations of our technique, which we now discuss.

Perhaps the most limiting assumption in our system is that video must be segmented into homogeneous regions in order to be parsed into the IR (and so subsequently rendered).
As discussed in Section 8.6, certain classes of video (for example crowd scenes, or running water) do not readily lend themselves to segmentation, and so cause our method difficulty. Typically such scenes are under-segmented as large feature subvolumes, causing an unappealing loss of detail in the animation. This is not surprising; the segmentation of such scenes would be a difficult task even for a human observer. Thus although we are able to produce large improvements in the temporal coherence of many animations, our method is less generally applicable than optical flow based methods, which are able to operate on all classes of video, albeit with a lower degree of temporal coherence. The problem of compromising between a high level model for accuracy, and a lower level model for generality, is an issue that has repeatedly surfaced in this thesis, and we defer discussion of this matter to our conclusions in Part IV. In summary, however, we view our method as an alternative to, rather than a replacement for, optical flow based AR.

The second significant limitation of our system stems from the use of homographies to estimate inter-frame motion from an object’s internal texture. We assume regions to be rigid bodies undergoing motion that is well modelled by a plane to plane transformation; in effect we assume objects in the video sequence may be approximated as planar surfaces. There are some situations where lack of internal texture can cause ambiguities to creep into this model; for example if an object moves in front of an untextured background, is that background static and being occluded, or is that background deforming around the foreground object? Currently we assume rigid bodies and so search for the best homography to account for the shape change of the background. The worst case outcome of poor motion modelling is a decrease in the temporal coherence of any markings or brush strokes within the interiors of objects.
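As a rough illustration of the plane to plane motion model discussed above, the following sketch applies a 3×3 homography to a set of 2D points in homogeneous coordinates. The function name `apply_homography`, the matrices, and the stroke positions are hypothetical; how the system actually estimates homographies from internal texture is not shown here.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography H (list of three rows)."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to the image plane.
    return (xh / w, yh / w)


# Hypothetical inter-frame motion of one region: a translation by (5, -2).
H_translate = [[1.0, 0.0, 5.0],
               [0.0, 1.0, -2.0],
               [0.0, 0.0, 1.0]]

# The identity leaves every point where it was.
H_identity = [[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]]

# Hypothetical stroke positions inside the region.
stroke_anchors = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]

moved = [apply_homography(H_translate, p) for p in stroke_anchors]
unmoved = [apply_homography(H_identity, p) for p in stroke_anchors]
```

Because a single matrix is applied to every stroke in the region, all strokes move consistently with one another. The identity case is one simple way to picture a region whose strokes are not moved at all between frames.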
Other artistic styles (such as sketchy outlines or cartoon-style rendering) do not use the homography data in the IR, and so are unaffected. As a work-around we allow the user to set the motion models of video objects to be “stationary” if they deform in an undesirable manner. This single “point and click” corrective interaction is necessary to introduce additional knowledge into an under-constrained system, and is in line with the high level of creative interaction we desire with the animator.

Future work might examine whether the planar surface assumption could be replaced by an improved model; perhaps a triangulated mesh, or replacement of the linear bases which form the plane with curvilinear bases (adapting the recent “kernel PCA” technique of [137]). However, many of the video sequences we have presented contain distinctly non-planar surfaces which nevertheless create aesthetically acceptable animations, exhibiting higher levels of temporal coherence than the current state of the art. We therefore question whether the additional effort in fitting more complex models would pay off in terms of rendering quality.

We did not set out to produce a fully automated system: not only do we desire interaction with the Video Paintbox for creative reasons (setting high level parameters, etc.) but also, on rare occasions, for the correction of the Computer Vision algorithms in the front end. The general segmentation problem precludes the possibility of segmenting any given video into semantically meaningful parts. However we have kept the burden of correction low (Section 8.5). Users need only click on video objects once, for example to merge two over-segmented feature sub-volumes in the video, and those changes are propagated throughout the spatiotemporal video volume automatically. In practical terms, user correction is often unnecessary, but when needed takes no more than a couple of minutes of user time.
This is in contrast to the hundreds of man-hours required to correct the optical flow motion fields of contemporary video driven AR techniques [61]. A selection of source and rendered video clips has been included in Appendix C.
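The single-click merge of over-segmented sub-volumes described in this section can be pictured with a toy example. The sketch below assumes the video volume is stored as a nested list of integer region labels indexed by frame, row and column; this layout and the name `merge_labels` are illustrative assumptions, not the data structures of the Video Paintbox.

```python
def merge_labels(volume, keep, absorb):
    """Relabel every voxel carrying `absorb` as `keep`, across all frames.

    `volume` is a list of frames; each frame is a list of rows of integer
    region labels. A new volume is returned and the input is left unchanged.
    """
    return [
        [[keep if label == absorb else label for label in row] for row in frame]
        for frame in volume
    ]


# Hypothetical two-frame label volume in which one object has been
# over-segmented into sub-volumes 1 and 2.
volume = [
    [[1, 1, 2],
     [3, 3, 2]],
    [[1, 2, 2],
     [3, 3, 3]],
]

# One corrective action: merge sub-volume 2 into sub-volume 1.
merged = merge_labels(volume, keep=1, absorb=2)
```

Because the labels identify sub-volumes rather than per-frame regions, a single relabelling reaches every frame in which the object appears, which is why the correction does not have to be repeated frame by frame.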

John Philip Collomosse | J. Collomosse

[1] Dorin Comaniciu,et al. Mean Shift: A Robust Approach Toward Feature Space Analysis , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[2] A. Finkelstein,et al. Nonphotorealistic rendering , 2003, IEEE Computer Graphics and Applications.

[3] Donald E. Knuth,et al. The art of computer programming: sorting and searching (volume 3) , 1973 .

[4] Derek J. Paddon,et al. Perceptually Realistic Flower Generation , 2000, WSCG.

[5] Siu Chi Hsu,et al. Drawing and animation using skeletal strokes , 1994, SIGGRAPH.

[6] Eric Keppel,et al. Approximating Complex Surfaces by Triangulation of Contour Lines , 1975, IBM J. Res. Dev..

[7] Robert C. Bolles,et al. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography , 1981, CACM.

[8] John F. Canny,et al. A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9] Gregory D. Hager,et al. A Particle Filter without Dynamics for Robust 3D Face Tracking , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[10] Binh Pham. Expressive brush strokes , 1991, CVGIP Graph. Model. Image Process..

[11] G. Wyszecki,et al. Color Science Concepts and Methods , 1982 .

[12] Eric Daniels,et al. Deep canvas in Disney's Tarzan , 1999, SIGGRAPH '99.

[13] Herbert Freeman,et al. On the Encoding of Arbitrary Geometric Configurations , 1961, IRE Trans. Electron. Comput..

[14] Craig W. Reynolds. Flocks, herds, and schools: a distributed behavioral model , 1987, SIGGRAPH.

[15] David Salesin,et al. Rendering parametric surfaces in pen and ink , 1996, SIGGRAPH.

[16] John Lasseter,et al. Principles of traditional animation applied to 3D computer animation , 1987, SIGGRAPH.

[17] David R. Forsey,et al. How to Render Frames and Influence People , 1994, Comput. Graph. Forum.

[18] Peter Meer,et al. Synergism in low level vision , 2002, Object recognition supported by user interaction for service robots.

[19] Alan Watt,et al. 3D Computer Graphics , 1993 .

[20] Bruce Gooch,et al. Non-photorealistic rendering , 2001 .

[21] Bernd Jähne,et al. Spatio-Temporal Image Processing , 1993, Lecture Notes in Computer Science.

[22] John H. Holland,et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[23] Barbara J. Meier. Painterly rendering for animation , 1996, SIGGRAPH.

[24] D. Rubin,et al. Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[25] John A. Nelder,et al. A Simplex Method for Function Minimization , 1965, Comput. J..

[26] Anil K. Jain,et al. Face Detection in Color Images , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[27] John P. Collomosse,et al. Cubist Style Rendering from Photographs , 2003, IEEE Trans. Vis. Comput. Graph..

[28] Mario Costa Sousa,et al. Observational Models of Graphite Pencil Materials , 2000, Comput. Graph. Forum.

[29] Zhiwei Zhu,et al. Real Time 3D Face Pose Tracking From an Uncalibrated Camera , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[30] Jean-Daniel Fekete,et al. TicTacToon: a paperless system for professional 2D animation , 1995, SIGGRAPH.

[31] David England,et al. Modelling the Texture of Paint , 1992, Comput. Graph. Forum.

[32] S. Ganapathy,et al. A new general triangulation method for planar contours , 1982, SIGGRAPH.

[33] Dario Maio,et al. Real-time face location on gray-scale static images , 2000, Pattern Recognition.

[34] Daniel Cohen-Or,et al. Volume graphics , 1993, Computer.

[35] Student,et al. THE PROBABLE ERROR OF A MEAN , 1908 .

[36] David Capel,et al. Image Mosaicing and Super-resolution , 2004, Distinguished Dissertations.

[37] G. Daniel,et al. Visualising Video Sequences using Direct Volume Rendering , 2003, VVG.

[38] John W. Buchanan,et al. Comprehensive Halftoning of 3D Scenes , 1999, Comput. Graph. Forum.

[39] Aaron Hertzmann,et al. Fast paint texture , 2002, NPAR '02.

[40] Yunhe Pan,et al. Advanced Design for a Realistic Virtual Brush , 2003, Comput. Graph. Forum.

[41] A. David Marshall,et al. Tracking people in three dimensions using a hierarchical model of dynamics , 2002, Image Vis. Comput..

[42] Thierry Pudet,et al. Real Time Fitting of Hand‐Sketched Pressure Brushstrokes , 1994, Comput. Graph. Forum.

[43] Christopher G. Harris,et al. A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[44] Jim R. Parker,et al. Algorithms for image processing and computer vision , 1996 .

[45] William V. Baxter,et al. DAB: Interactive Haptic Painting with 3D Virtual Brushes , 2001, SIGGRAPH Courses.

[46] Adam Finkelstein,et al. Real-time hatching , 2001, SIGGRAPH.

[47] F. A. Seiler,et al. Numerical Recipes in C: The Art of Scientific Computing , 1989 .

[48] W. Davis van Bakergem,et al. Free Hand Plotting Is It Live or Is It Digital ? , 2003 .

[49] Ali M. S. Zalzala,et al. An evolutionary algorithm for collision free motion planning of multi-arm robots , 1995 .

[50] William J. Christmas,et al. Building Classifier Ensembles for Automatic Sports Classification , 2003, Multiple Classifier Systems.

[51] Aseem Agarwala,et al. SnakeToonz: a semi-automatic approach to creating cel animation from video , 2002, NPAR '02.

[52] Steve Strassmann,et al. Hairy brushes , 1986, SIGGRAPH.

[53] Michael Isard,et al. CONDENSATION—Conditional Density Propagation for Visual Tracking , 1998, International Journal of Computer Vision.

[54] Christos Faloutsos,et al. VideoCube: A Novel Tool for Video Mining and Classification , 2002, ICADL.

[55] Volker Rehrmann. Stabile, echtzeitfähige Farbbildauswertung , 1994 .

[56] Levente Kovács,et al. Creating Animations Combining Stochastic Paintbrush Transformation and Motion Detection , 2002 .

[57] Yunhe Pan,et al. A Solid Model Based Virtual Hairy Brush , 2002, Comput. Graph. Forum.

[58] Michio Shiraishi,et al. An algorithm for automatic painterly rendering based on local source image approximation , 2000, NPAR '00.

[59] Marcin Szymanski,et al. Simulating cartoon style animation , 2002, NPAR '02.

[60] Lee Markosian,et al. Artistic silhouettes: a hybrid approach , 2000, NPAR '00.

[61] Peter Litwinowicz,et al. Processing images and video for an impressionist effect , 1997, SIGGRAPH.

[62] Ching Y. Suen,et al. A fast parallel algorithm for thinning digital patterns , 1984, CACM.

[63] Milan Sonka,et al. Image Processing, Analysis and Machine Vision , 1993, Springer US.

[64] Abdelwaheb Marzouki,et al. Estimation of generalized mixtures and its application in image segmentation , 1997, IEEE Trans. Image Process..

[65] Ivan E. Sutherland,et al. Sketchpad a Man-Machine Graphical Communication System , 1899, Outstanding Dissertations in the Computer Sciences.

[66] Richard Szeliski,et al. Image mosaicing for tele-reality applications , 1994, Proceedings of 1994 IEEE Workshop on Applications of Computer Vision.

[67] Steven M. Seitz,et al. Frontiers in 3D Photography: Reflectance and Motion , 2003, International Conference on Vision, Video and Graphics.

[68] Gunther Wyszecki,et al. Color Science: Concepts and Methods, Quantitative Data and Formulae, 2nd Edition , 2000 .

[69] R. Hughes. The Shock of the New , 1983 .

[70] Ioannis A. Kakadiaris,et al. Model-based estimation of 3D human motion with occlusion based on active multi-viewpoint selection , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[71] Wolfgang Leister. Computer Generated Copper Plates , 1994, Comput. Graph. Forum.

[72] Robert L. Cook,et al. Distributed ray tracing , 1984, SIGGRAPH.

[73] Andy Harter,et al. Parameterisation of a stochastic model for human face identification , 1994, Proceedings of 1994 IEEE Workshop on Applications of Computer Vision.

[74] Przemyslaw Prusinkiewicz,et al. A Few Good Lines: Suggestive Drawing of 3D Models , 2003, Comput. Graph. Forum.

[75] N Weisstein,et al. Flicker induces depth: Spatial and temporal factors in the perceptual segregation of flickering and nonflickering regions in depth , 1984, Perception & psychophysics.

[76] Satoshi Matsuoka,et al. Teddy: A Sketching Interface for 3D Freeform Design , 1999, SIGGRAPH Courses.

[77] Demetri Terzopoulos,et al. Snakes: Active contour models , 2004, International Journal of Computer Vision.

[78] ISAAC COHEN,et al. Using deformable surfaces to segment 3-D images and infer differential structures , 1992, CVGIP Image Underst..

[79] Henry Fuchs,et al. Optimal surface reconstruction from planar contours , 1977, CACM.

[80] Timothy F. Cootes,et al. Locating Salient Object Features , 1998, BMVC.

[81] Karl Sims,et al. Evolving virtual creatures , 1994, SIGGRAPH.

[82] Ken Perlin,et al. Painterly rendering for video and interaction , 2000, NPAR '00.

[83] David Salesin,et al. Computer-generated watercolor , 1997, SIGGRAPH.

[84] T. Başar,et al. A New Approach to Linear Filtering and Prediction Problems , 2001 .

[85] Francisco Herrera,et al. Learning with Genetic Algorithms , 2001 .

[86] Douglas DeCarlo,et al. Abstracted painterly renderings using eye-tracking data , 2002, NPAR '02.

[87] Takafumi Saito,et al. Comprehensible rendering of 3-D shapes , 1990, SIGGRAPH.

[88] Alvy Ray Smith,et al. Plants, fractals, and formal languages , 1984, SIGGRAPH.

[89] Mark S. Nixon,et al. Feature Extraction and Image Processing , 2002 .

[90] Lance Williams,et al. Pyramidal parametrics , 1983, SIGGRAPH.

[91] A. Armstrong,et al. Direct DCT indexing using genetic algorithm concepts , 2002, Proceedings 20th Eurographics UK Conference.

[92] Harry Shum,et al. Stylizing motion with drawings , 2003, SCA '03.

[93] Joe Marks,et al. Spacetime constraints revisited , 1993, SIGGRAPH.

[94] Cassidy J. Curtis. Loose and sketchy animation , 1998, International Conference on Computer Graphics and Interactive Techniques.

[95] David Salesin,et al. Interactive pen-and-ink illustration , 1994, SIGGRAPH.

[96] Christoph Bregler,et al. Turning to the masters: motion capturing cartoons , 2002, ACM Trans. Graph..

[97] Donald E. Knuth,et al. The Art of Computer Programming: Volume 3: Sorting and Searching , 1998 .

[98] Mario Costa Sousa,et al. Computer‐Generated Graphite Pencil Rendering of 3D Polygonal Models , 1999, Comput. Graph. Forum.

[99] David E. Goldberg,et al. Genetic Algorithms in Search Optimization and Machine Learning , 1988 .

[100] John P. Collomosse,et al. Cartoon-Style Rendering of Motion from Video , 2003, VVG.

[101] Jia-Ping Wang,et al. Stochastic Relaxation on Partitions With Connected Components and Its Application to Image Segmentation , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[102] Aaron Hertzmann,et al. Painterly rendering with curved brush strokes of multiple sizes , 1998, SIGGRAPH.

[103] John Collomosse,et al. Painterly rendering using image salience , 2002, Proceedings 20th Eurographics UK Conference.

[104] David A. Forsyth,et al. Finding Naked People , 1996, ECCV.

[105] Jake K. Aggarwal,et al. Tracking human motion using multiple cameras , 1996, Proceedings of 13th International Conference on Pattern Recognition.

[106] Lee Markosian,et al. Real-time nonphotorealistic rendering , 1997, SIGGRAPH.

[107] Hans P. Moravec. Obstacle avoidance and navigation in the real world by a seeing robot rover , 1980 .

[108] B. S. Manjunath,et al. Unsupervised Segmentation of Color-Texture Regions in Images and Video , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[109] Josef Kittler,et al. The Adaptive Hough Transform , 1987, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[110] Richard Williams,et al. The Animator's Survival Kit , 2001 .

[111] E. Catmull,et al. A CLASS OF LOCAL INTERPOLATING SPLINES , 1974 .

[112] Adam Finkelstein,et al. Suggestive contours for conveying shape , 2003, ACM Trans. Graph..

[113] M. N. Zakaria,et al. Interactive Evolutionary Approach to Character Animation , 2001, WSCG.

[114] David Salesin,et al. Computer-generated pen-and-ink illustration , 1994, SIGGRAPH.

[115] Mario Costa Sousa,et al. Observational Model of Blenders and Erasers in Computer-Generated Pencil Rendering , 1999, Graphics Interface.

[116] Ken Perlin,et al. Live paint: painting with procedural multiscale textures , 1995, SIGGRAPH.

[117] J. Russell,et al. The psychology of facial expression: Frontmatter , 1997 .

[118] Wen Tang,et al. Intelligent self-learning characters for computer games , 2002, Proceedings 20th Eurographics UK Conference.

[119] David Salesin,et al. Image Analogies , 2001, SIGGRAPH.

[120] R. Weale. Vision. A Computational Investigation Into the Human Representation and Processing of Visual Information. David Marr , 1983 .

[121] Theodosios Pavlidis,et al. Picture Segmentation by a Tree Traversal Algorithm , 1976, JACM.

[122] Jitendra Malik,et al. Color- and texture-based image segmentation using EM and its application to content-based image retrieval , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[123] Jintae Lee. Physically-based modeling of brush painting , 1997, Comput. Networks ISDN Syst..

[124] Hungwen Li,et al. Fast Hough transform: A hierarchical approach , 1986, Comput. Vis. Graph. Image Process..

[125] J. Andrew Bangham,et al. The Art of Scale-Space , 2003, BMVC.

[126] Michael Gleicher,et al. HijackGL: reconstructing from streams for stylized rendering , 2002, NPAR '02.

[127] Mubarak Shah,et al. A Fast algorithm for active contours and curvature estimation , 1992, CVGIP Image Underst..

[128] Michael Isard,et al. Contour Tracking by Stochastic Propagation of Conditional Density , 1996, ECCV.

[129] John H. Holland,et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[130] Gershon Elber,et al. Interactive Line Art Rendering of Freeform Surfaces , 1999, Comput. Graph. Forum.

[131] Gershon Elber,et al. Line Art Illustrations of Parametric and Implicit Forms , 1998, IEEE Trans. Vis. Comput. Graph..

[132] Turner Whitted,et al. Anti-aliased line drawing using brush extrusion , 1983, SIGGRAPH.

[133] Masayuki Nakajima,et al. Volumetric modeling of colored pencil drawing , 1999, Proceedings. Seventh Pacific Conference on Computer Graphics and Applications (Cat. No.PR00293).

[134] David Salesin,et al. Orientable textures for image-based pen-and-ink illustration , 1997, SIGGRAPH.

[135] Ioannis A. Kakadiaris,et al. 3D human body model acquisition from multiple views , 1995, Proceedings of IEEE International Conference on Computer Vision.

[136] Oliver Deussen,et al. Floating Points: A Method for Computing Stipple Drawings , 2000, Comput. Graph. Forum.

[137] Peter Shirley,et al. Artistic Vision: painterly rendering using computer vision techniques , 2002, NPAR '02.

[138] Demetri Terzopoulos,et al. Artificial fishes: physics, locomotion, perception, behavior , 1994, SIGGRAPH.

[139] Victor Ostromoukhov. Digital facial engraving , 1999, SIGGRAPH '99.

[140] Han-Wei Shen,et al. Chronovolumes: A Direct Rendering Technique for Visualizing Time-Varying Data , 2003, VG.

[141] David B. Fogel,et al. Inductive reasoning and bounded rationality reconsidered , 1999, IEEE Trans. Evol. Comput..

[142] John Collomosse,et al. A trainable low-level feature detector , 2004, ICPR 2004.

[143] Jake K. Aggarwal,et al. Tracking human motion in an indoor environment , 1995, Proceedings., International Conference on Image Processing.

[144] A. Mehrabian. Pleasure-arousal-dominance: A general framework for describing and measuring individual differences in Temperament , 1996 .

[145] Atreyi Kankanhalli,et al. Automatic partitioning of full-motion video , 1993, Multimedia Systems.

[146] Leonard McMillan,et al. Proscenium: a framework for spatio-temporal video editing , 2003, ACM Multimedia.

[147] Paul Haeberli,et al. Paint by numbers: abstract image representations , 1990, SIGGRAPH.

[148] Stephen M. Smith,et al. SUSAN—A New Approach to Low Level Image Processing , 1997, International Journal of Computer Vision.

[149] Tamás Szirányi,et al. Random paintbrush transformation , 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.

[150] O. Reiser,et al. Principles Of Gestalt Psychology , 1936 .