Big Data Analysis for Media Production

A typical high-end film production generates several terabytes of data per day, either as footage from multiple cameras or as background information about the set (laser scans, spherical captures, etc.). This paper presents solutions that improve the integration of these multiple data sources and help to understand their quality and content. These solutions are useful both for supporting creative decisions on set (or near it) and for enhancing the postproduction process. The main cinema-specific contributions, tested on a multisource production dataset made publicly available for research purposes, are the monitoring and quality assurance of multicamera set-ups, multisource registration and acceleration of 3-D reconstruction, anthropocentric visual analysis techniques for semantic content annotation, and integrated 2-D/3-D web visualization tools. We also discuss improvements to basic techniques for acceleration, clustering, and visualization, which were necessary to deal with the very large multisource data and can be applied to other big data problems in diverse application fields.
