Integrated Concept of Human Motions and Objects based on Multi-layered Multimodal LDA

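In recent years, research on intelligent robots has been actively pursued. Object categorization and recognition are core technologies for such robots, and the ability to recognize object categories is important if a robot is to act flexibly in unknown environments. To date, many studies have addressed object categorization and recognition using features that can be obtained from objects [1]-[6]. The authors have previously proposed multimodal categorization methods that extend pLSA (probabilistic Latent Semantic Analysis) and LDA (Latent Dirichlet Allocation), and have shown that, by using multiple modalities, a robot can learn object categories closer to human perception without supervision [7] [8]. What matters here is the prediction of unobserved information on the basis of the learned object categories, which leads to understanding by the robot [9]. It is also important that such object categories be learned without supervision,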

[1]  Larry S. Davis,et al.  Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition, 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Chong Wang,et al.  Simultaneous image classification and annotation , 2009, CVPR.

[3]  Tetsuya Ogata,et al.  Inter-modality mapping in robot with recurrent neural network , 2010, Pattern Recognit. Lett.

[4]  Pietro Perona,et al.  A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[5]  Luc De Raedt,et al.  Learning relational affordance models for robots in multi-object manipulation tasks , 2012, 2012 IEEE International Conference on Robotics and Automation.

[6]  Michael I. Jordan,et al.  Hierarchical Dirichlet Processes , 2006.

[7]  Chong Wang,et al.  Simultaneous image classification and annotation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Tadahiro Taniguchi,et al.  Double articulation analyzer for unsegmented human motion using Pitman-Yor language model and infinite hidden Markov model , 2011, 2011 IEEE/SICE International Symposium on System Integration (SII).

[9]  Manuel Lopes,et al.  Learning Object Affordances: From Sensory-Motor Coordination to Imitation , 2008, IEEE Transactions on Robotics.

[10]  Alexei A. Efros,et al.  Discovering object categories in image collections , 2005.

[11]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings.

[12]  Yoshihiko Nakamura,et al.  Prediction of human behaviors in the future through symbolic inference , 2011, 2011 IEEE International Conference on Robotics and Automation.

[13]  Fei-Fei Li,et al.  Recognizing Human-Object Interactions in Still Images by Modeling the Mutual Context of Objects and Human Poses , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Olivier Mangin,et al.  Learning to recognize parallel combinations of human motion primitives with linguistic descriptions using non-negative matrix factorization , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[15]  Jivko Sinapov,et al.  Object category recognition by a humanoid robot using behavior-grounded relational learning , 2011, 2011 IEEE International Conference on Robotics and Automation.

[16]  Yoshihiko Nakamura,et al.  Bigram-based natural language model and statistical motion symbol model for scalable language of humanoid robots , 2012, 2012 IEEE International Conference on Robotics and Automation.

[17]  Andrea Vedaldi,et al.  Vlfeat: an open and portable library of computer vision algorithms , 2010, ACM Multimedia.