Paper information - Authoring effective depictions of reality by combining multiple samples of the plenoptic function

Cameras are powerful tools for depicting the world around us, but the images they produce are interpretations of reality that often fail to resemble our intended interpretation. While photographs provide a strong illusion of realism, they are not always effective at depicting what the author hoped to depict. The recent digitization of photography and video offers the opportunity to move beyond the traditional constraints of analog photography and improve the techniques we use to create depictions from light in a scene. In this thesis, I explore one approach to addressing this challenge. My approach begins by capturing multiple digital photographs and videos of a scene. I present algorithms and interfaces that identify the best pieces of this captured imagery, as well as algorithms for seamlessly fusing these pieces into new depictions that are better than any of the originals. As I show in this thesis, the results are often much more effective than what could be achieved using traditional photographic techniques.

I apply this approach to three projects in particular. The "photomontage" system is an interactive tool that partially automates the process of combining the best features of a stack of images. The user authors a composite image by specifying high-level objectives that the composite should exhibit; the system then constructs a depiction from pieces of the input images by maximizing these objectives while also minimizing visual artifacts. In the next project, I extend the photomontage approach to handle sequences of photographs captured from shifted viewpoints. The output is a "multi-viewpoint panorama" that is useful for depicting scenes too long to effectively image with the single-viewpoint perspective produced by a traditional camera. In the final project, I extend my approach to video sequences. I introduce the "panoramic video texture," which is a video with a wide field of view that appears to play continuously and indefinitely. The result is a new medium that combines the benefits of panoramic photography and video to provide a more immersive depiction of a scene. The key challenge in this project is that panoramic video textures are created from the input of a single video camera that slowly pans across the scene; thus, although only a portion of the scene has been imaged at any given time, the output must simultaneously portray motion throughout the scene. Throughout this thesis, I demonstrate how these novel techniques can be used to create expressive visual media that more effectively depicts reality.

David Salesin | Aseem Agarwala
