Using relative spatial relationships to improve individual region recognition

Like words in text processing, image regions are polysemous and need disambiguation. If the feature representations of two different object classes are close or overlapping, a region falling in the intersection may be recognized as either object. We propose a way to disambiguate such regions using knowledge of their relative spatial positions. Given a segmented image with a list of candidate objects for each region, the goal is to find the set of objects that best fits this knowledge. We construct a consistency function that assigns a score to each spatial arrangement of objects in the image. The proposed algorithm is demonstrated on the task of recognizing backgrounds (sky, water, snow, trees, grass, sand, ground, buildings) in images. An evaluation on a database of 10,000 images shows that the number of false positives is reduced while the recognition rate remains almost unchanged.
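The idea above can be sketched as follows. This is a minimal illustration, not the paper's actual method: the consistency table, its values, and the single "above" relation are hypothetical, and the exhaustive search stands in for whatever optimization the paper uses.

```python
from itertools import product

# Hypothetical pairwise consistency scores: CONSISTENCY[(a, b)] is the
# plausibility that an object labeled `a` appears above an object labeled `b`.
CONSISTENCY = {
    ("sky", "water"): 1.0,
    ("sky", "trees"): 0.9,
    ("water", "sky"): 0.1,
    ("trees", "sky"): 0.2,
    ("sky", "sky"): 0.5,
    ("water", "water"): 0.5,
    ("trees", "trees"): 0.5,
    ("water", "trees"): 0.4,
    ("trees", "water"): 0.6,
}

def best_assignment(candidates, above_pairs):
    """Pick one label per region from its candidate list so that the summed
    pairwise consistency over all 'above' relations is maximal.

    candidates: list of candidate-label lists, one per region
    above_pairs: list of (i, j) index pairs meaning region i is above region j
    """
    best, best_score = None, float("-inf")
    for labels in product(*candidates):  # exhaustive search over label choices
        score = sum(CONSISTENCY.get((labels[i], labels[j]), 0.0)
                    for i, j in above_pairs)
        if score > best_score:
            best, best_score = labels, score
    return best, best_score

# Two ambiguous regions: the top one could be sky or water, the bottom one
# water or trees; region 0 lies above region 1. The spatial prior picks
# ("sky", "water") as the most consistent joint labeling.
labels, score = best_assignment([["sky", "water"], ["water", "trees"]], [(0, 1)])
print(labels, score)
```

Exhaustive enumeration is exponential in the number of regions; it is used here only to keep the disambiguation step explicit.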