Revisiting active perception

Despite recent successes in robotics, artificial intelligence, and computer vision, a complete artificial agent must necessarily include active perception. Many ideas and methods for accomplishing this have appeared in the past, their broader utility perhaps impeded by insufficient computational power or costly hardware. The history of these ideas, perhaps selective due to our perspectives, is presented with the goal of organizing the past literature and highlighting the seminal contributions. We argue that those contributions are as relevant today as they were decades ago and, with the state of modern computational tools, are poised to find new life in the robotic perception systems of the next decade.
