Slow visual search in a fast-changing world

Attention-focusing mechanisms and domain-informed selection of representations can make real-time vision tasks tractable with limited computational power. This paper describes ongoing work in distributed real-time vision that aims to use cheap, plentiful workstations and PCs rather than special-purpose hardware. I discuss ARGUS, a system inspired by the visual routines theory of human vision. In ARGUS, reactive feature-tracking agents maintain minimal, task-dependent descriptions of relevant image features by direct observation of the live video stream. Routines for model-based object recognition operate on these descriptions. Because higher-level processing is decoupled from the maintenance of lower-level representations, the visual subsystem can provide real-time feedback for closed-loop tasks even when high-level perceptual processing is slow relative to the video frame rate. I describe experiments in moving-object recognition that demonstrate the strength of this approach when the perceived scene changes faster than high-level analysis can categorize it.
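
To make the decoupling concrete, the following is a minimal Python sketch of the pattern described above. It is not the ARGUS implementation; all names (FeatureTracker, tracking_loop, recognition_loop) and the simulated video source are hypothetical. One thread refreshes a minimal feature description at frame rate while a slower recognition loop samples whatever description is current.

    import threading
    import time
    import math
    import random

    class FeatureTracker:
        """Reactive agent: maintains a minimal, task-dependent
        description (here, one 2-D position) of an image feature."""

        def __init__(self):
            self._lock = threading.Lock()
            self._position = (0.0, 0.0)

        def observe(self, measurement):
            # Overwrite the description with the newest frame's result;
            # no queue of stale frames builds up behind a slow consumer.
            with self._lock:
                self._position = measurement

        def latest(self):
            with self._lock:
                return self._position

    def tracking_loop(tracker, stop, fps=30):
        # Stand-in for direct observation of the live video stream:
        # the "feature" moves on a circle with measurement noise.
        t = 0.0
        while not stop.is_set():
            x = math.cos(t) + random.gauss(0.0, 0.01)
            y = math.sin(t) + random.gauss(0.0, 0.01)
            tracker.observe((x, y))
            t += 2.0 * math.pi / (fps * 4)
            time.sleep(1.0 / fps)

    def recognition_loop(tracker, stop, passes=5):
        # Model-based recognition running far slower than frame rate;
        # each pass still starts from the current description.
        for _ in range(passes):
            time.sleep(0.5)  # simulate slow high-level analysis
            x, y = tracker.latest()
            print(f"recognizer sees feature near ({x:+.2f}, {y:+.2f})")
        stop.set()

    if __name__ == "__main__":
        tracker, stop = FeatureTracker(), threading.Event()
        threading.Thread(target=tracking_loop, args=(tracker, stop),
                         daemon=True).start()
        recognition_loop(tracker, stop)

Because the recognizer reads only the freshest description rather than a backlog of frames, each of its conclusions refers to the scene as it is now; this is what keeps closed-loop feedback valid when a single analysis pass spans many frame times.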