Visual attention-based approach for prediction of abnormalities in CCTV video surveillance

In a CCTV control room, operators' attention is the key factor in differentiating between benign and malignant scenarios. Such tasks often require attending to multiple monitors showing complex and constantly changing visual elements. Previous studies suggest that operators' performance declines as cognitive load increases. To reduce this workload, we propose a novel method based on human-computer vision that prioritizes the monitors/cameras according to abnormalities in their video streams and draws the operators' attention when required. The method considers operators' eye-fixations, their online estimation of threats, and optical-flow visual features. It automatically learns activities under weak supervision from the recorded eye movements and threat assessments of multiple operators undertaking the monitoring task. In vision-based surveillance applications, abnormality is often expressed as an action or event occurring at an unusual region of a video or at an unusual time. It is quantitatively measured by computing the likelihood under a learnt model that is trained on benign scenarios. Such supervised learning requires a large number of training examples, a requirement that can be reduced by incorporating the operators' knowledge. In the proposed work, the machine learning algorithm takes the unusual regions from the eye-fixations and the quantitative measure of abnormality from the threat assessments of multiple operators. Our experimental evaluation demonstrates significant performance, in terms of recall and precision, in predicting abnormality on a dataset that consists of 10 different camera views and threat assessments from 11 operators.
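
As an illustration of the likelihood-based abnormality measure and camera prioritization described above, the following is a minimal sketch and not the method evaluated in this work. It assumes each camera is summarized by a small optical-flow feature vector, and it substitutes a diagonal Gaussian for the learnt model. The function names, feature values, and camera identifiers are hypothetical, and the eye-fixation and threat-assessment inputs are not modelled here.

```python
import numpy as np

def fit_benign_model(features):
    """Fit a diagonal Gaussian to feature vectors taken from benign footage.

    Assumption: a Gaussian stands in for whatever model is actually learnt.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    var = features.var(axis=0) + 1e-6  # variance floor avoids division by zero
    return mean, var

def abnormality_score(model, x):
    """Negative log-likelihood of x under the benign model (higher = more unusual)."""
    mean, var = model
    x = np.asarray(x, dtype=float)
    return float(0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var))

def prioritize_cameras(model, camera_features):
    """Return camera ids sorted from most to least abnormal, plus the raw scores."""
    scores = {cam: abnormality_score(model, f) for cam, f in camera_features.items()}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores

# Hypothetical benign training data: (mean flow magnitude, mean flow direction).
rng = np.random.default_rng(0)
benign = rng.normal(loc=[1.0, 0.0], scale=0.2, size=(500, 2))
model = fit_benign_model(benign)

# Hypothetical current features for three camera views.
current = {"cam_1": [1.05, 0.02], "cam_2": [3.0, 1.5], "cam_3": [0.9, -0.1]}
ranking, scores = prioritize_cameras(model, current)
print(ranking)
```

In this sketch the camera whose features lie farthest from the benign distribution is ranked first, which is the view an operator would be prompted to check.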