Modelling Peripheral Pre-Attention And Foveal Fixation For Search-Directed Machine Vision Systems

The human visual system has evolved towards a close integration of visual information processing and visual data acquisition. Fast, peripheral, pre-attentive vision uses low-resolution input to direct the fovea to fixate on important features, producing an efficient visual search pattern. Here we describe a system that emulates the multi-resolution aspect of human visual processing to make data analysis computationally efficient. The visual task is the location of specific features in human faces for use in videotelephony. The feature location technique uses a Kohonen-based neural network architecture to permit learning by example. Input data is presented as a resolution pyramid to emulate the differing modes of human vision. The system is implemented on a RISC-based microcomputer workstation with purpose-built real-time image acquisition hardware. It performs well on both familiar and unseen image data and, with refinement, could form the basis of a usable system.
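
The multi-resolution idea can be illustrated with a minimal sketch; it is not the implementation described here. It assumes a pyramid built by 2x2 block averaging and a coarse-to-fine search in which the coarsest level plays the role of peripheral vision and each finer level re-examines only a small window around the previous estimate, loosely analogous to a foveal fixation. All function names, the `margin` parameter, and the `brightness` scoring function are hypothetical; in particular, `brightness` is only a placeholder for a learned feature detector such as the Kohonen-based network.

```python
def downsample(image):
    """Halve resolution by averaging each 2x2 block (image is a list of rows)."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w)
        ]
        for r in range(h)
    ]


def build_pyramid(image, levels):
    """Return [full resolution, half resolution, ...] with `levels` entries."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def best_in_window(image, score, r0, r1, c0, c1):
    """Return the (row, col) with the highest score inside the clipped window."""
    r0, c0 = max(r0, 0), max(c0, 0)
    r1, c1 = min(r1, len(image)), min(c1, len(image[0]))
    candidates = ((r, c) for r in range(r0, r1) for c in range(c0, c1))
    return max(candidates, key=lambda rc: score(image, rc[0], rc[1]))


def coarse_to_fine_search(pyramid, score, margin=1):
    """Search the coarsest level exhaustively, then refine level by level."""
    coarse = pyramid[-1]
    r, c = best_in_window(coarse, score, 0, len(coarse), 0, len(coarse[0]))
    for level in reversed(pyramid[:-1]):
        # A coarse pixel (r, c) covers rows 2r..2r+1 and columns 2c..2c+1
        # at the next finer level; search that block plus a small margin.
        r, c = 2 * r, 2 * c
        r, c = best_in_window(level, score,
                              r - margin, r + 2 + margin,
                              c - margin, c + 2 + margin)
    return r, c


def brightness(image, r, c):
    """Placeholder feature score (assumption): the pixel value itself."""
    return image[r][c]


if __name__ == "__main__":
    # Synthetic 16x16 image with a single bright "feature" at row 10, column 5.
    img = [[0.0] * 16 for _ in range(16)]
    img[10][5] = 1.0
    print(coarse_to_fine_search(build_pyramid(img, 3), brightness))  # (10, 5)
```

With three levels and `margin=1`, the search in this example scores 16 positions at each level, 48 in total, against 256 for an exhaustive scan of the 16x16 image. The trade-off in this sketch is that a feature too small to survive the averaging may be missed at the coarsest level.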