Learning Online Data Association

When an agent interacts with a complex environment, it receives a stream of percepts in which it may detect entities, such as objects or people. To build up a coherent, low-variance estimate of the underlying state, it is necessary to fuse information from multiple detections over time. To do this fusion, the agent must decide which detections to associate with one another. We address this data-association problem in the setting of an online filter, in which each observation is processed by aggregating it into an existing object hypothesis. Classic methods with strong probabilistic foundations exist, but they are computationally expensive and require models that can be difficult to acquire. In this work, we use the deep-learning tools of sparse attention and representation learning to learn a machine that processes a stream of detections and outputs a set of hypotheses about objects in the world. We evaluate this approach on simple clustering problems, problems with dynamics, and a complex image-based domain. We find that it generalizes well from short to long observation sequences and from a few to many hypotheses, outperforming other learning approaches and classical non-learning methods.
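To make the online-filter setting concrete, the following is a minimal sketch of a hand-written baseline, not the learned method described in this work. It assumes point-valued detections, a Euclidean distance gate, and running-mean fusion. The function name `online_associate` and the parameters `num_hypotheses` and `new_threshold` are hypothetical and chosen only for illustration.

```python
import numpy as np

def online_associate(detections, num_hypotheses, new_threshold=1.0):
    """Greedy online association (illustrative baseline, an assumption).

    Each detection is either fused into its nearest existing hypothesis
    (running mean) or, if it is farther than `new_threshold` from every
    hypothesis and a slot is free, used to open a new hypothesis.
    """
    means = []        # one mean vector per hypothesis
    counts = []       # number of detections fused into each hypothesis
    assignments = []  # hypothesis index chosen for each detection

    for z in detections:
        z = np.asarray(z, dtype=float)
        if means:
            dists = [np.linalg.norm(z - m) for m in means]
            k = int(np.argmin(dists))
        # Open a new hypothesis if none exists yet, or if the nearest one
        # is too far away and there is still room for another.
        if not means or (dists[k] > new_threshold and len(means) < num_hypotheses):
            means.append(z.copy())
            counts.append(1)
            k = len(means) - 1
        else:
            # Incremental mean update: fuse the detection into hypothesis k.
            counts[k] += 1
            means[k] += (z - means[k]) / counts[k]
        assignments.append(k)

    return np.array(means), assignments


# Example: two well-separated objects observed in interleaved order.
stream = [[0.0, 0.0], [5.0, 5.0], [0.2, 0.0], [5.0, 5.2]]
means, assignments = online_associate(stream, num_hypotheses=2)
```

A rule of this kind needs a hand-tuned distance threshold and a fixed fusion rule. The approach in this work instead learns the association and aggregation behavior from data.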
