ColAtt‐Net: In Reducing the Ambiguity of Pedestrian Orientations on Attribute‐Aware Semantic Segmentation Task

Semantic segmentation has become one of the most active topics in computer vision and deep learning. Driven by growing demand to solve semantic segmentation together with recognition of object attributes, a new task called attribute‐aware semantic segmentation has recently been introduced. Because the task requires pixel‐wise estimation of both an object's class and its attributes, such as a pedestrian's body orientation, previous works had difficulty handling attributes that are ambiguous at the object level, and in particular had difficulty segmenting pedestrians together with their correct orientations. This paper proposes ColAtt‐Net, an attribute‐aware semantic segmentation model augmented with a column‐wise mask branch that predicts pedestrians' orientations along the horizontal axis of the input image. We assume that pedestrians captured by a car‐mounted camera are distributed horizontally, so that within each column of the input image the pedestrian pixels can be labeled uniformly with a single orientation. In the proposed method, the output of the base semantic segmentation model is split into two branches: one segments the object categories, while the other, the novel column‐wise attribute branch, maps the horizontally distributed pedestrian orientations. The method improves the performance of attribute‐aware semantic segmentation by reducing the ambiguity in segmenting pedestrian orientations. The experimental results show improvements in pedestrian orientation segmentation both quantitatively and qualitatively. This paper also discusses how the improved performance benefits autonomous driving systems. © 2020 Institute of Electrical Engineers of Japan. Published by Wiley Periodicals LLC.
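
The column-wise assumption described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `combine_branches`, the class id `PEDESTRIAN`, the filler value `NO_ORIENTATION`, and the tensor shapes are all assumptions made for illustration. It only shows how one orientation per image column might be broadcast onto the pedestrian pixels produced by a separate category branch.

```python
import numpy as np

PEDESTRIAN = 1        # hypothetical class id for "pedestrian"
NO_ORIENTATION = -1   # hypothetical filler for non-pedestrian pixels


def combine_branches(class_logits, column_logits, pedestrian_id=PEDESTRIAN):
    """Merge a category branch with a column-wise orientation branch.

    class_logits:  array of shape (C, H, W), per-pixel class scores.
    column_logits: array of shape (K, W), per-column orientation scores.
    Returns (class_map, orientation_map), both of shape (H, W).
    """
    class_map = class_logits.argmax(axis=0)             # (H, W)
    column_orientation = column_logits.argmax(axis=0)   # (W,)

    # Every row receives the same per-column orientation label.
    per_pixel = np.broadcast_to(column_orientation, class_map.shape)

    # Keep the orientation only where the category branch says "pedestrian".
    orientation_map = np.where(class_map == pedestrian_id, per_pixel, NO_ORIENTATION)
    return class_map, orientation_map


# Toy example: 2 classes, 4 orientations, a 3 x 4 image.
toy_classes = np.array([[0, 1, 0, 0],
                        [0, 1, 0, 1],
                        [0, 1, 0, 1]])
toy_columns = np.array([0, 2, 1, 3])  # one orientation id per column

class_logits = np.eye(2)[toy_classes].transpose(2, 0, 1)  # (2, 3, 4) one-hot scores
column_logits = np.eye(4)[toy_columns].T                  # (4, 4) one-hot scores

class_map, orientation_map = combine_branches(class_logits, column_logits)
print(orientation_map)
```

In this toy case the pedestrian pixels in the second column all receive orientation 2 and those in the fourth column all receive orientation 3, which mirrors the assumption that pedestrian pixels in one column share a single orientation.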
