Coarse-to-Fine Image Search Using Neural Networks

The efficiency of image search can be greatly improved by using a coarse-to-fine search strategy with a multi-resolution image representation. However, if the resolution is so low that the objects have few distinguishing features, search becomes difficult. We show that search performance at such low resolutions can be improved by using context information, i.e., objects visible at low resolution that are not the objects of interest but are associated with them. The neural networks can be given explicit context information as inputs, or they can learn to detect the context objects, in which case the user does not have to be aware of their existence. We also use Integrated Feature Pyramids, which represent high-frequency information at low resolutions. Multi-resolution search techniques allow us to combine information about the appearance of the objects at many scales efficiently. A natural form of exemplar selection also arises from these techniques. We illustrate these ideas by training hierarchical systems of neural networks to find clusters of buildings in aerial photographs of farmland.
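The coarse-to-fine strategy described above can be illustrated with a minimal sketch. It is not the system from this work: the pyramid here is built by plain 2x2 averaging (an assumption), the `score` callback is a hypothetical stand-in for a trained network at each level, and all function names are chosen for illustration only. The point it shows is that only locations accepted at a coarse level are examined at the next finer level, so far fewer detector evaluations are needed than in an exhaustive full-resolution scan.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]


def downsample(img: Image) -> Image:
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w)
        ]
        for r in range(h)
    ]


def build_pyramid(img: Image, levels: int) -> List[Image]:
    """Return the pyramid ordered from finest (index 0) to coarsest."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def coarse_to_fine_search(
    pyramid: List[Image],
    score: Callable[[Image, int, int, int], float],
    threshold: float,
) -> Tuple[List[Tuple[int, int]], int]:
    """Search from the coarsest level down, refining only accepted locations.

    `score(img, level, row, col)` is a hypothetical detector; in the setting
    of the abstract it would be a network trained for that level.
    Returns the accepted full-resolution locations and the number of
    detector evaluations performed.
    """
    coarsest = pyramid[-1]
    candidates = [(r, c) for r in range(len(coarsest))
                  for c in range(len(coarsest[0]))]
    evaluations = 0
    for level in range(len(pyramid) - 1, -1, -1):
        img = pyramid[level]
        survivors = []
        for r, c in candidates:
            evaluations += 1
            if score(img, level, r, c) >= threshold:
                survivors.append((r, c))
        if level == 0:
            return survivors, evaluations
        # Each accepted location maps to four children one level finer.
        candidates = [(2 * r + dr, 2 * c + dc)
                      for r, c in survivors
                      for dr in (0, 1) for dc in (0, 1)]
    return [], evaluations


def toy_score(img: Image, level: int, r: int, c: int) -> float:
    """Toy detector: pixel value rescaled to undo the averaging dilution."""
    return img[r][c] * (4 ** level)


# Usage: an 8x8 image with a single bright target at row 5, column 2.
image = [[0.0] * 8 for _ in range(8)]
image[5][2] = 1.0
hits, n_evals = coarse_to_fine_search(build_pyramid(image, 3), toy_score, 1.0)
print(hits, n_evals)
```

In this toy case the search touches a handful of locations per level instead of all 64 full-resolution pixels. The sketch also exposes the difficulty the abstract raises: at coarse levels a small target is diluted by averaging, so a real detector there has little to go on, which is where context objects and feature pyramids that preserve high-frequency information become useful.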
