Tracking using a local closed-world assumption: tracking in the football domain

In this work we address the problem of tracking objects in a complex, dynamic scene. The objects are non-rigid and difficult to model geometrically. Their motion is erratic, and they change shape rapidly between frames sampled at 30 frames per second. The objects have low spatial resolution, and the video used for tracking was taken with a panning and zooming camera. Finally, the objects are tracked in sequences up to eight seconds long while moving over a complex background. We suggest that conventional tracking methods are unlikely to perform well at tracking small objects in complex environments because they do not use contextual information to drive feature selection. We propose using "closed-world" analysis to incorporate contextual knowledge into low-level tracking. A closed world is a space-time region of an image where contextual information, such as the number and type of objects within the region, is assumed to be known. Given that knowledge, the region can be analyzed locally using image processing algorithms, and "context-specific" features can be selected for tracking. A context-specific feature is one that has been chosen based upon the context to maximize the chance of successful tracking between frames. We test our algorithm in the "football domain." We describe how closed-world analysis and context-specific tracking can be applied to tracking football players and present the details of our implementation. We include tracking results that demonstrate the wide range of tracking situations the algorithm successfully handles, as well as a few examples of where the algorithm fails. Finally, we suggest some improvements and future extensions.

Acknowledgements

Aaron Bobick, my advisor, has made my stay at the Media Lab an enjoyable and academically rewarding experience. His guidance, enthusiasm, understanding, and insight have been invaluable. I am grateful to be working with an individual who genuinely cares about the well-being of his students and who knows that there is much to life beyond the halls of MIT, even if he does think that Emacs is the greatest program ever written! My thesis readers were Harlyn Baker and Ken Haase. Thanks to both, especially Harlyn, who provided helpful comments that have improved the work presented here. Many thanks go to the HLV students of Vismod: Lee Campbell, Nassir Navab, Andy Wilson, and Claudio Pinhanez. I'm grateful for their company, friendship, expertise, and energy. I'd especially like to thank my second-year counterpart, Lee, with whom I've had many interesting discussions over numerous lunches and …
