Understanding expressive action

We strain our eyes, cramp our necks, and destroy our hands trying to interact with computers on their terms. At the extreme, we strap on devices and weigh ourselves down with cables trying to re-create a sense of place inside the machine, while cutting ourselves off from the world and people around us. The alternative is to make the real environment responsive to our actions. It is not enough for environments to respond simply to the presence of people or objects: they must also be aware of the subtleties of changing situations. If all the spaces we inhabit are to be responsive, they must not require encumbering devices to be worn, and they must adapt to changes in the environment and changes of context. This dissertation examines a body of sophisticated perceptual mechanisms developed in response to these needs, as well as a selection of human-computer interface sketches designed to push the technology forward and explore the possibilities of this novel interface idiom. Specifically, it examines in depth the formulation of Dyna, a fully recursive framework for computer vision that improves the performance of human motion tracking. The improvement comes from combining a three-dimensional, physics-based model of the human body with modifications to the pixel classification algorithms that enable them to take advantage of this high-level knowledge. The result is a novel vision framework that has no completely bottom-up processes, and is therefore significantly faster and more stable than other approaches.

Thesis Supervisor: Alex P. Pentland
Title: Academic Head, Program in Media Arts & Sciences; Professor of Media Arts & Sciences
