Paper information - A Novel Deep Neural Network that Uses Space-Time Features for Tracking and Recognizing a Moving Object

Abstract: This work proposes a deep neural network (DNN) that reliably recognizes a chosen object captured with a webcam while it moves in 3D space. Autoencoding and substitutional reality are used to train a shallow net until it achieves zero tracking error in a discrete environment. The trained net is then placed in a real-world closed-loop system, where images from the webcam produce displacement information for a moving region of interest (ROI) inside the image itself. This loop gives rise to an emergent tracking behavior that creates a self-maintained flow of compressed space-time data. Next, short-term memory elements play a key role by building new representations in the form of a space-time matrix. These representations are delivered as input to a second shallow network, which acts as the "recognizer". A noise-balanced learning method is used to train the recognizer quickly with real-world images, giving rise to a simple yet powerful robotic eye with a slender neural processor that vigorously tracks and recognizes the chosen object. The system has been tested with real images in real time.
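The closed loop outlined in the abstract (ROI, displacement, re-centred ROI, compressed code, short-term memory) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it is hypothetical. `centroid_displacement` is a simple brightness-centroid rule standing in for the trained tracking net, `compress` uses row and column sums in place of an autoencoder code, and a fixed-length `deque` plays the role of the short-term memory whose contents form the space-time matrix.

```python
from collections import deque


def make_frame(height, width, y, x):
    """Build a toy frame (list of lists) with one bright pixel at (y, x)."""
    frame = [[0] * width for _ in range(height)]
    frame[y][x] = 1
    return frame


def crop(frame, top, left, size):
    """Return the size x size region of interest (ROI) of a frame."""
    return [row[left:left + size] for row in frame[top:top + size]]


def centroid_displacement(roi):
    """Stand-in for the tracking net (assumption): offset of the
    brightness centroid from the ROI centre, in whole pixels."""
    total = sum(sum(row) for row in roi)
    if total == 0:
        return 0, 0  # nothing visible: leave the ROI where it is
    cy = sum(y * v for y, row in enumerate(roi) for v in row) / total
    cx = sum(x * v for row in roi for x, v in enumerate(row)) / total
    centre = (len(roi) - 1) / 2
    return round(cy - centre), round(cx - centre)


def compress(roi):
    """Stand-in for the autoencoder code (assumption): row sums
    followed by column sums of the ROI."""
    return [sum(row) for row in roi] + [sum(col) for col in zip(*roi)]


def track(frames, top, left, size, memory_len=4):
    """Closed loop: each frame yields a displacement that moves the ROI;
    the code of the re-centred ROI is pushed into short-term memory."""
    memory = deque(maxlen=memory_len)  # keeps only the latest codes
    positions = []
    for frame in frames:
        dy, dx = centroid_displacement(crop(frame, top, left, size))
        # Move the ROI, clamped so it stays inside the image.
        top = min(max(top + dy, 0), len(frame) - size)
        left = min(max(left + dx, 0), len(frame[0]) - size)
        positions.append((top, left))
        memory.append(compress(crop(frame, top, left, size)))
    # Rows are time steps, columns are code entries: a space-time matrix.
    return positions, [list(code) for code in memory]


# Toy input: a bright pixel drifting one column to the right per frame.
frames = [make_frame(12, 12, 5, 5 + t) for t in range(5)]
positions, space_time = track(frames, top=3, left=3, size=5)
```

In this toy setting the ROI follows the pixel across the frame, and `space_time` holds the four most recent codes, one row per time step. A recognizer in the sense of the abstract would take such a matrix as its input; the noise-balanced training of that recognizer is not sketched here.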

[1] Yoshua Bengio, et al. Learning Deep Architectures for AI, 2007, Found. Trends Mach. Learn.

[2] Naotaka Fujii, et al. Substitutional Reality System: A Novel Experimental Platform for Experiencing Alternative Reality, 2012, Scientific Reports.

[3] Steven J. Luck, et al. Visual short term memory, 2007, Scholarpedia.

[4] Yuting Zhang, et al. Improving object detection with deep convolutional networks via Bayesian optimization and structured prediction, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5] Mohammad H. Mahoor, et al. Vision-Based Landing of Light Weight Unmanned Helicopters on a Smart Landing Platform, 2011, J. Intell. Robotic Syst.

[6] E. Averbach, et al. Short-term memory in vision, 1961.

[7] Oscar Chang, et al. Reliable object recognition by using cooperative neural agents, 2014, 2014 International Joint Conference on Neural Networks (IJCNN).

[8] Dumitru Erhan, et al. Deep Neural Networks for Object Detection, 2013, NIPS.

[9] Dorian Aur, et al. Can we build a conscious machine?, 2014, ArXiv.

[10] Karen Drukker, et al. A study of the effect of noise injection on the training of artificial neural networks, 2009, 2009 International Joint Conference on Neural Networks.

[11] Miguel A. Olivares-Méndez, et al. 3D pose estimation based on planar object tracking for UAVs control, 2010, 2010 IEEE International Conference on Robotics and Automation.

[12] Oscar Chang. A Bio-Inspired Robot with Visual Perception of Affordances, 2014, ECCV Workshops.

[13] Yann LeCun, et al. What is the best multi-stage architecture for object recognition?, 2009, 2009 IEEE 12th International Conference on Computer Vision.

[14] Lorenzo L. Pesce, et al. Noise injection for training artificial neural networks: a comparison with weight decay and early stopping, 2009, Medical physics.

[15] Nicholas R. Jennings, et al. Intelligent agents: theory and practice, 1995, The Knowledge Engineering Review.