Distributed compression and fusion of nonnegative sparse signals for multiple-view object recognition

Visual surveillance in complex urban environments requires an intelligent system that automatically tracks and identifies multiple objects of interest across a network of distributed cameras. Robust object recognition is critical in such a system, both to compensate for adverse conditions such as visual occlusion and to improve the performance of tasks such as multi-object association and data fusion with hybrid sensor modalities. In this paper, we propose an efficient distributed data compression and fusion scheme that encodes and transmits SIFT-based visual histograms over a multi-hop network to perform accurate 3-D object recognition. The method harnesses the emerging theory of (distributed) compressive sensing to encode high-dimensional, nonnegative sparse signals via random projection, which is unsupervised and independent of the sensor modality. A multi-hop protocol then transmits the compressed visual data to a base-station computer while keeping the bandwidth constant regardless of the number of active camera nodes in the network. Finally, the multiple-view object features are simultaneously recovered using ℓ1-minimization as an efficient decoder. The efficacy of the algorithm is validated using up to four Berkeley CITRIC camera motes deployed in a realistic indoor environment. The substantial computation power of the CITRIC mote also enables fast compression of SIFT-type visual features extracted from object images.
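To make the encode/decode idea concrete, here is a minimal single-camera sketch of random projection followed by nonnegative ℓ1 recovery. It is an illustration under stated assumptions, not the paper's implementation: the Gaussian projection matrix, the dimensions `D`, `d`, and `k`, the function names `encode` and `decode_nonneg_l1`, and the use of SciPy's linear-programming solver are all choices made for this example. The sketch relies on the fact that, for a nonnegative signal, minimizing the ℓ1 norm subject to the measurement constraints is a linear program.

```python
import numpy as np
from scipy.optimize import linprog


def encode(x, A):
    """Compress a high-dimensional histogram x with a random projection A."""
    return A @ x


def decode_nonneg_l1(y, A):
    """Recover a nonnegative sparse vector by solving
    min sum(x)  subject to  A x = y,  x >= 0,
    which equals l1-minimization when x is constrained to be nonnegative."""
    n = A.shape[1]
    res = linprog(c=np.ones(n), A_eq=A, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


# Hypothetical sizes: a 200-bin histogram with 8 nonzero bins,
# compressed to 60 random measurements.
rng = np.random.default_rng(0)
D, d, k = 200, 60, 8

x = np.zeros(D)
support = rng.choice(D, size=k, replace=False)
x[support] = rng.integers(1, 20, size=k)   # nonnegative counts

A = rng.standard_normal((d, D))            # assumed Gaussian projection
y = encode(x, A)                           # what a camera node would transmit
x_hat = decode_nonneg_l1(y, A)             # what the base station would solve

max_err = float(np.max(np.abs(x_hat - x)))
print("largest reconstruction error:", max_err)
```

The sensor-side step is a single matrix-vector product, while the linear program runs only at the decoder. The sketch covers one view; joint recovery of several views and the multi-hop protocol are not modeled here.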
