An architecture for online semantic labeling on UGVs

We describe an architecture that provides online semantic labeling capabilities to field robots operating in urban environments. At the core of our system is the stacked hierarchical classifier developed by Munoz et al., which classifies regions in monocular color images using models derived from hand-labeled training data. The classifier is trained to identify buildings, several kinds of hard surfaces, grass, trees, and sky. When taking this algorithm into the real world, difficult and varying lighting conditions require careful control of the imaging process. First, camera exposure is controlled in software, which examines all of the image's pixels, to compensate for the poorly performing, simplistic algorithm built into the camera. Second, by merging multiple images taken with different exposure times, we are able to synthesize images with higher dynamic range than those produced by the sensor itself. The sensor's limited dynamic range makes it difficult to properly expose, at the same time, areas in shadow and high-albedo surfaces that are directly illuminated by the sun. Texture is a key feature used by the classifier, and under- or overexposed regions lacking texture are a leading cause of misclassifications. The results of the classifier are shared with higher-level elements operating on the UGV in order to perform tasks such as building identification from a distance and finding traversable surfaces.
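The two imaging steps described above can be illustrated with a minimal sketch. The abstract does not specify the exposure-control rule or the merging method, so everything here is an assumption made for illustration: the function names `next_exposure` and `merge_exposures`, the target mean of 0.45, the exposure limits, the hat-shaped weighting, and the premise that pixel values are linear in scene radiance and scaled to [0, 1].

```python
import numpy as np


def next_exposure(image, exposure_s, target_mean=0.45, min_s=1e-5, max_s=0.05):
    """Hypothetical software exposure update using every pixel of the image.

    Scales the exposure time so that the mean pixel value moves toward
    target_mean. Assumes pixel values in [0, 1] that are linear in exposure.
    """
    mean = max(float(np.mean(image)), 1e-3)  # limit the jump on nearly black frames
    return float(np.clip(exposure_s * target_mean / mean, min_s, max_s))


def merge_exposures(images, exposures_s):
    """Merge bracketed exposures into a relative radiance map.

    Each image is divided by its exposure time to estimate radiance, and the
    estimates are averaged with a hat weight that trusts mid-range pixels
    and nearly ignores pixels close to 0 (underexposed) or 1 (saturated).
    """
    num = np.zeros(np.shape(images[0]), dtype=np.float64)
    den = np.zeros_like(num)
    for img, t in zip(images, exposures_s):
        img = np.asarray(img, dtype=np.float64)
        # Hat weight: 1 at mid-gray, near 0 at the extremes. The small
        # constant keeps the denominator nonzero for every pixel.
        w = 1.0 - np.abs(2.0 * img - 1.0) + 1e-3
        num += w * img / t
        den += w
    return num / den


# Synthetic scene whose radiance spans a range too wide for any single exposure.
radiance = np.array([[10.0, 50.0], [200.0, 800.0]])
times = [0.001, 0.004, 0.016]
frames = [np.clip(radiance * t, 0.0, 1.0) for t in times]
hdr = merge_exposures(frames, times)
```

In this synthetic case the longest exposure saturates the two brightest pixels, while the shortest leaves the darkest pixel at 1% of full scale; the merged map draws each pixel mostly from the exposures in which it is well exposed. A real camera would additionally need its response curve calibrated, and the merged result would typically be tone mapped before being passed to a classifier that expects ordinary 8-bit images.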

[1] Robert L. Stevenson, et al. Dynamic range improvement through multiple exposures, 1999, Proceedings 1999 International Conference on Image Processing (Cat. 99CH36348).

[2] Martial Hebert, et al. Stacked Hierarchical Labeling, 2010, ECCV.

[3] Natasha Gelfand, et al. Multi-exposure imaging on mobile devices, 2010, ACM Multimedia.

[4] Martial Hebert, et al. Enhancing robot perception using human teammates, 2013, AAMAS.

[5] Grigorios Tsoumakas, et al. Mining Multi-label Data, 2010, Data Mining and Knowledge Discovery Handbook.

[6] Richard Szeliski, et al. High dynamic range video, 2003, ACM Trans. Graph.

[7] Thomas S. Huang, et al. Image processing, 1971.

[8] Min Chen, et al. Tone Mapping for HDR Image using Optimization A New Closed Form Solution, 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[9] Alberto Del Bimbo, et al. Proceedings of the international conference on Multimedia, 2010.

[10] Milan Sonka, et al. Image Processing, Analysis and Machine Vision, 1993, Springer US.