Online Learning for Multimodal Data Fusion With Application to Object Recognition

We consider online multimodal data fusion, where the goal is to combine information from multiple modes to identify an element in a large dictionary. We address this problem in the context of object recognition, focusing on tactile sensing as one of the modes. Using a tactile glove with seven sensors, various individuals grasp different objects to obtain 7-D time series, where each component represents the pressure sequence applied to one sensor. The pressure data of all objects are stored in a dictionary as a reference. The objective is to match a streaming vector time series, obtained by grasping an unknown object, to a dictionary object. We propose an algorithm that may start with prior knowledge provided by other modes. Receiving pressure data sequentially, the algorithm uses a dissimilarity metric to modify the prior and form a probability distribution over the dictionary. When the dictionary objects are dissimilar in shape, we show empirically that our algorithm recognizes the unknown object even with a uniform prior. If the dictionary contains an object similar to the unknown object, our algorithm needs the prior from other modes to identify the unknown object. Notably, our algorithm achieves performance comparable to standard offline classification techniques, such as support vector machines, with significantly lower computation time.
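The sequential update described above can be illustrated with a minimal Python sketch. The abstract does not specify the dissimilarity metric or the exact update rule, so the choices here are assumptions made for illustration only: Euclidean distance between the incoming 7-D sample and the time-aligned reference sample, and an exponential-weights update with a learning rate `eta`. The names `recognize`, `euclidean`, and `eta` are hypothetical and do not come from the paper.

```python
import math


def euclidean(a, b):
    """Assumed dissimilarity metric: Euclidean distance between two samples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(stream, dictionary, prior=None, eta=1.0):
    """Return a probability distribution over dictionary objects.

    stream:     iterable of 7-D pressure samples from the unknown object
    dictionary: maps object name -> reference series (list of 7-D samples)
    prior:      optional distribution from other modes; uniform if omitted
    eta:        assumed learning rate of the exponential-weights update
    """
    names = list(dictionary)
    if prior is None:
        prior = {name: 1.0 / len(names) for name in names}

    # Work with log-weights so that long streams do not underflow.
    log_w = {
        name: math.log(prior[name]) if prior[name] > 0 else float("-inf")
        for name in names
    }

    for t, sample in enumerate(stream):
        for name in names:
            series = dictionary[name]
            # Hold the last reference sample if the reference is shorter.
            ref = series[min(t, len(series) - 1)]
            log_w[name] -= eta * euclidean(sample, ref)

    # Normalize into a probability distribution.
    peak = max(log_w.values())
    unnorm = {name: math.exp(lw - peak) for name, lw in log_w.items()}
    total = sum(unnorm.values())
    return {name: w / total for name, w in unnorm.items()}


if __name__ == "__main__":
    # Toy dictionary with two clearly different pressure profiles.
    toy_dictionary = {
        "mug": [[1.0] * 7 for _ in range(5)],
        "ball": [[0.2] * 7 for _ in range(5)],
    }
    toy_stream = [[0.9] * 7 for _ in range(5)]
    print(recognize(toy_stream, toy_dictionary))
```

With a uniform prior, the toy stream concentrates probability on the reference it tracks most closely. When two references are nearly identical, the distances barely separate them, and a non-uniform `prior` supplied by another mode decides the outcome, which mirrors the behavior the abstract reports.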
