Paper information - Distributed compression and fusion of nonnegative sparse signals for multiple-view object recognition

Visual surveillance in complex urban environments requires an intelligent system that automatically tracks and identifies multiple objects of interest across a network of distributed cameras. Robust object recognition is critical for compensating for adverse conditions and for improving performance in tasks such as multi-object association, handling of visual occlusion, and data fusion with hybrid sensor modalities. In this paper, we propose an efficient distributed data compression and fusion scheme that encodes and transmits SIFT-based visual histograms in a multi-hop network to perform accurate 3-D object recognition. The method harnesses an emerging theory of (distributed) compressive sensing to encode high-dimensional, nonnegative sparse signals via random projection, which is unsupervised and independent of the sensor modality. A multi-hop protocol then transmits the compressed visual data to a base-station computer while keeping the required bandwidth constant regardless of the number of active camera nodes in the network. Finally, the multiple-view object features are simultaneously recovered via ℓ1-minimization, which serves as an efficient decoder. The efficacy of the algorithm is validated using up to four Berkeley CITRIC camera motes deployed in a realistic indoor environment. The substantial computation power of the CITRIC mote also enables fast compression of SIFT-type visual features extracted from object images.
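The encode/decode idea in the abstract can be illustrated with a minimal single-view sketch; it is not the authors' implementation. It assumes NumPy and SciPy are available, uses a Gaussian random matrix `A` as the projection, and uses hypothetical sizes `n`, `m`, and `k` for the histogram length, the number of measurements, and the number of nonzero bins. For a nonnegative signal the ℓ1 norm equals the sum of the entries, so the decoder reduces to a linear program: minimize sum(x) subject to A x = y and x >= 0.

```python
import numpy as np
from scipy.optimize import linprog


def compress(x, A):
    """Encode a histogram by random projection: y = A x."""
    return A @ x


def decode_nonnegative_l1(y, A):
    """Recover x by solving: min sum(x) subject to A x = y, x >= 0."""
    n = A.shape[1]
    res = linprog(c=np.ones(n), A_eq=A, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


# Hypothetical sizes, chosen only for illustration (not taken from the paper).
n, m, k = 100, 50, 5
rng = np.random.default_rng(0)

# Synthetic nonnegative sparse "histogram": k nonzero bins out of n.
x = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x[support] = rng.uniform(1.0, 5.0, size=k)

# Assumed projection: i.i.d. Gaussian entries; the decoder must know the same A.
A = rng.standard_normal((m, n))

y = compress(x, A)                   # m numbers are transmitted instead of n
x_hat = decode_nonnegative_l1(y, A)  # decoding at the base station
print("max abs error:", np.max(np.abs(x_hat - x)))
```

In this sketch the sensor side only performs a matrix-vector product, while the linear program runs at the base station. The multi-view case described in the abstract, where several camera nodes share a joint decoder, is not covered here.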
