Unsupervised camera localization in crowded spaces

Existing camera networks in public spaces such as train terminals or malls can help social robots navigate crowded scenes. However, the cameras must first be localized, i.e., the positions and poses of all cameras must be known in a common reference frame. In this work, we estimate the relative location of any pair of cameras solely from the noisy trajectories observed by each camera. We propose a fully unsupervised learning technique that uses unlabelled pedestrian motion patterns captured in crowded scenes. We first estimate the pairwise camera parameters by optimally matching single-view pedestrian tracks using social awareness. We then show the impact of jointly estimating the network parameters, which we formulate as a nonlinear least-squares optimization problem that leverages a continuous approximation of the matching function. We evaluate our approach in real-world environments such as train terminals, where several hundred individuals need to be tracked across dozens of cameras every second.
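
To make the two ingredients named above more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each camera reports pedestrian positions as 2D ground-plane points, that the relative pose between two cameras is a planar rotation plus translation, and that a soft-min over squared distances is an acceptable stand-in for a continuous approximation of the matching function. The function names (`fit_rigid_2d`, `apply_rigid_2d`, `soft_matching_cost`) and the `temperature` parameter are hypothetical and chosen only for illustration.

```python
import math


def fit_rigid_2d(src, dst):
    """Least-squares rotation and translation mapping src onto dst.

    Assumes src[i] and dst[i] are already matched observations of the
    same pedestrian (an assumption of this sketch).
    """
    n = len(src)
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(p[0] for p in dst) / n
    cy_d = sum(p[1] for p in dst) / n
    dot = 0.0
    cross = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        ax, ay = xs - cx_s, ys - cy_s
        bx, by = xd - cx_d, yd - cy_d
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    # The optimal angle maximizes cos(theta) * dot + sin(theta) * cross.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, tx, ty


def apply_rigid_2d(points, theta, tx, ty):
    """Rotate each point by theta, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def soft_matching_cost(src, dst, theta, tx, ty, temperature=0.1):
    """Smooth surrogate for a matching cost with unknown correspondences.

    For every transformed src point, take the soft-min of its squared
    distances to all dst points. Unlike a hard assignment, this is
    differentiable in (theta, tx, ty).
    """
    moved = apply_rigid_2d(src, theta, tx, ty)
    total = 0.0
    for mx, my in moved:
        d2 = [(mx - x) ** 2 + (my - y) ** 2 for x, y in dst]
        m = min(d2)
        # Numerically stable soft-min: -T * log(sum(exp(-d2 / T))).
        total += m - temperature * math.log(
            sum(math.exp(-(d - m) / temperature) for d in d2)
        )
    return total
```

In such a sketch, `fit_rigid_2d` would play the role of the pairwise step once tracks are matched, while a cost like `soft_matching_cost`, summed over camera pairs, is the kind of smooth objective a nonlinear least-squares solver could minimize jointly over the network.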
