VIDEO ANALYTICS IN RETAIL ENVIRONMENT

This investigation of video analytics in a retail environment analyzes video feeds, provided by PrismSkyLabs, covering one business day in a creamery. The analysis extracts the foreground using motion estimation and filtering, and trains a person classifier on images of people and of items commonly found in a creamery, acquired through Google's search engine. The motion-based foreground estimator uses binary hypothesis testing with temporal filtering so that transient motion is not mistaken for occupancy. Occupancy is detected by analyzing accumulated motion statistics. Once occupancy is declared, a k-NN classification algorithm determines whether the region is occupied by a person or by another item. The system model is based on a feature vector containing the spatial coordinates, the spatial change of illumination in each region of interest, and the Laplacian, which detects edges. Rather than applying the conventional vector-metric approach, our system works with region covariance matrices. Consequently, a proper metric is required to measure the distance between the covariance matrices of two regions. We employ a metric based on the generalized eigenvalues of the pair of covariance matrices being compared. Two methods of validation were performed. (1) A leave-part-out cross-validation approach, in which the training dataset is randomly split into a new training set and a new query set; the two sets are then passed to the classification algorithm and accuracy results are obtained. The experiment was repeated ten times (10-fold cross-validation), giving successful classification rates between 89% and 93%. (2) A validation on the creamery videos, in which the entire training set is applied to the creamery video frames at user-determined regions of interest.
The results of the real-time validation ranged from 74.5% to 98.1% for the motion analysis methods and from 54.3% to 63.4% for the region covariance methods. The analytics collected on the regions where people are identified consist mainly of the occupancy time. In conclusion, the proposed system performs fairly well, but it might be improved by incorporating different features into the system model or by applying a more mature classifier such as a support vector machine or a neural network.
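
The motion-based occupancy decision described in the abstract can be illustrated with a minimal sketch. The abstract does not give the test statistic, the threshold, or the filter, so everything specific here is an assumption: the statistic is the mean absolute frame difference inside a region of interest, the binary decision is a fixed threshold on that statistic, and the temporal filter is a sliding vote that requires `min_hits` motion decisions in the last `window` frames. The function names and parameters are hypothetical.

```python
import numpy as np

def motion_flags(frames, roi, threshold):
    """Per-frame binary decision (assumed form): 1 if the mean absolute
    difference between consecutive frames inside the ROI exceeds threshold."""
    y0, y1, x0, x1 = roi
    flags = []
    for prev, cur in zip(frames[:-1], frames[1:]):
        # Cast to float first so uint8 frames do not wrap around on subtraction.
        a = np.asarray(prev, dtype=float)[y0:y1, x0:x1]
        b = np.asarray(cur, dtype=float)[y0:y1, x0:x1]
        flags.append(int(np.abs(b - a).mean() > threshold))
    return flags

def occupancy(flags, window=5, min_hits=4):
    """Temporal filter (assumed form): declare occupancy at time t only if at
    least min_hits of the last `window` decisions signalled motion."""
    out = []
    for t in range(len(flags)):
        recent = flags[max(0, t - window + 1): t + 1]
        out.append(int(sum(recent) >= min_hits))
    return out

def occupancy_seconds(occupied, fps):
    """Occupancy time: number of occupied frames divided by the frame rate."""
    return sum(occupied) / fps
```

With this filter an isolated motion decision, such as a single flickering frame, never reaches `min_hits` and so is not counted as occupancy, while a sustained run of motion decisions is.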
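
The region covariance classification step can be sketched in the same spirit. The feature layout below (pixel coordinates, first derivatives of intensity, and the Laplacian) is one assumed reading of the feature vector named in the abstract, the distance is the square root of the sum of squared logarithms of the generalized eigenvalues of the two covariance matrices, and the small ridge term, the value of `k`, and all function names are assumptions made for illustration.

```python
from collections import Counter
import numpy as np

def region_covariance(gray):
    """Covariance of per-pixel features [x, y, Ix, Iy, Laplacian] over a
    grayscale region (feature layout is an assumption)."""
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    ys, xs = np.mgrid[0:h, 0:w]
    iy, ix = np.gradient(gray)              # derivatives along rows, then columns
    lap = np.gradient(ix, axis=1) + np.gradient(iy, axis=0)
    feats = np.stack([xs, ys, ix, iy, lap], axis=-1).reshape(-1, 5)
    cov = np.cov(feats, rowvar=False)
    return cov + 1e-6 * np.eye(5)           # ridge keeps the matrix positive definite

def covariance_distance(a, b):
    """sqrt(sum_i ln^2 lambda_i), with lambda_i the generalized eigenvalues of (a, b)."""
    l = np.linalg.cholesky(b)
    # L^-1 A L^-T is symmetric and has the generalized eigenvalues of (A, B).
    m = np.linalg.solve(l, np.linalg.solve(l, a).T)
    lam = np.linalg.eigvalsh((m + m.T) / 2)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def knn_classify(query_cov, train_covs, train_labels, k=3):
    """Majority vote among the k training covariances nearest to the query."""
    dists = [covariance_distance(query_cov, c) for c in train_covs]
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

A region of interest would be turned into a descriptor with `region_covariance(patch)` and labelled with `knn_classify(descriptor, train_covs, train_labels)`, where the training descriptors come from the labelled image set.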
