High-Throughput Content-Based Video Analysis Technologies

In the Big Data era, analyzing the content of highly concurrent video data is a scientific problem that urgently needs a solution. In this paper, we introduce high-throughput content-based video analysis technologies for content-based monitoring of web images and videos. We give an intensive survey of the state of development and the trends in four key technologies: efficient video decoding and feature extraction on many-core processors, and high-dimensional indexing and semantic recognition on distributed systems. Furthermore, we introduce our latest research on these technologies: a parallel deblocking filter on many-core processors, extraction and mining of highly robust and parallel local features, distributed high-dimensional indexing, and ensemble learning for large-scale data. The aim is to take full advantage of the high performance of multi-grain parallel computing platforms and to provide key technologies for important applications such as Internet video monitoring and search.
