A 1920 × 1080 30-frames/s 2.3 TOPS/W Stereo-Depth Processor for Energy-Efficient Autonomous Navigation of Micro Aerial Vehicles

This paper presents a single-chip, high-performance, and energy-efficient stereo vision depth-estimation processor for micro aerial vehicles (MAVs). The proposed processor implements the state-of-the-art semi-global matching (SGM) algorithm to deliver full high-definition (HD, 1920 × 1080) stereo-depth outputs at a maximum throughput of 38 frames/s. Algorithm-architecture co-optimization introduces overlapping block-based processing, which eliminates the need for very large on-chip memory and off-chip DRAM. We exploit the inherent data parallelism of the algorithm by computing 128 local disparity costs and aggregating the SGM costs along four paths for all 128 disparities in parallel. A dependence-resolving scan, combined with a 16-stage deep pipeline, is introduced to hide the data dependence between neighboring pixels in the SGM algorithm. Moreover, we propose a customized ultra-high-bandwidth dual-port SRAM that exploits the unique memory access pattern of SGM to achieve highly energy-efficient memory access at a very high on-chip memory bandwidth of 1.64 Tb/s. The fabricated processor produces 512 levels of depth information for each pixel at full HD resolution with 30-frames/s performance, consuming 836 mW from a 0.75-V supply in TSMC 40-nm GP CMOS. We ported the design to a quadcopter MAV to demonstrate its performance in realistic real-time flight.
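
To make the aggregation step concrete, the following is a minimal software sketch of the standard SGM path recurrence (as formulated by Hirschmüller) along a single one-dimensional path. It is not the processor's implementation: the function name `aggregate_path`, the list-based data layout, and the penalty values used in the example are assumptions chosen for illustration. The inner loop over `d` is the part the abstract describes as being evaluated for all 128 disparities in parallel, and the dependence of each pixel on `prev` is the neighboring-pixel data dependence mentioned above.

```python
def aggregate_path(costs, p1, p2):
    """Aggregate local matching costs along one path (standard SGM recurrence).

    costs[x][d] is the local cost of disparity d at the x-th pixel on the path.
    p1 and p2 are the small and large smoothness penalties (illustrative values).
    """
    num_disp = len(costs[0])
    prev = list(costs[0])          # first pixel on the path has no predecessor
    aggregated = [prev]
    for x in range(1, len(costs)):
        prev_min = min(prev)       # best aggregated cost at the previous pixel
        cur = []
        for d in range(num_disp):  # hardware would evaluate all d in parallel
            best = prev[d]                              # same disparity
            if d > 0:
                best = min(best, prev[d - 1] + p1)      # disparity change of -1
            if d + 1 < num_disp:
                best = min(best, prev[d + 1] + p1)      # disparity change of +1
            best = min(best, prev_min + p2)             # any larger jump
            # Subtracting prev_min keeps the aggregated values bounded.
            cur.append(costs[x][d] + best - prev_min)
        aggregated.append(cur)
        prev = cur                 # next pixel depends on this result
    return aggregated


if __name__ == "__main__":
    # Two pixels, three disparities, hypothetical penalties p1=1 and p2=3.
    print(aggregate_path([[0, 5, 5], [5, 0, 5]], p1=1, p2=3))
```

In a full SGM pipeline the aggregated costs from several path directions (four in the processor described here) are summed per pixel, and the disparity with the lowest total is selected.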
