Classifier Grouping to Enhance Data Locality for a Multi-threaded Object Detection Algorithm

Object detection enables modern smart embedded devices to run intelligent applications and to interact with their environment appropriately and promptly. However, the limited computational resources of embedded devices are a barrier to executing computation-intensive object detection algorithms. Multi-threading on embedded multi-core systems offers an opportunity to boost performance, but the memory bottleneck limits performance scalability. Improving the data locality of applications and maximizing data reuse in on-chip caches have therefore become critical design concerns. This paper comprehensively analyzes the memory behavior and data locality of a multi-threaded object detection algorithm and proposes a novel Classifier-Grouping scheme that significantly enhances data reuse in the on-chip caches of embedded multi-core systems. When a multi-threaded object detection algorithm is executed on a cycle-accurate multi-core simulator, the proposed approach achieves up to 62% better performance than the original parallel program.
