Incorporating Side Information by Adaptive Convolution – Supplementary Material

Camera extrinsic parameters, such as the camera tilt angle and camera height, are useful side information for crowd counting. The tilt angle affects a person's appearance, while the distance from the camera affects a person's scale. The camera tilt angle and height can also be used to estimate the perspective map, whose values indicate the apparent size of a person at each image location. Because existing datasets do not include camera extrinsic parameters as side information, we collect a new dataset of indoor and outdoor scenes captured from various camera angles and heights. The scenes were captured with a smartphone camera mounted on a tripod for stability. The camera tilt angle was recorded with the smartphone's accelerometer-based tilt sensor, and the height of the camera above the ground plane was measured with a laser range finder. The perspective map for each scene is estimated from these camera extrinsic parameters.
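To make the link between the extrinsic parameters and the perspective map concrete, the following is a minimal sketch of one way such a map could be derived. It is not the procedure used for the dataset. It assumes an ideal pinhole camera with no roll, a flat ground plane, a known focal length in pixels, a principal point at the image center, and a reference person height of 1.75 m; the function name `perspective_map_rows` and all parameter names are hypothetical. Under these assumptions the map depends only on the image row, so the sketch returns one value per row: the pixel height of a standing person whose feet project to that row.

```python
import math


def perspective_map_rows(height_m, tilt_deg, focal_px, image_rows, person_m=1.75):
    """Pixel height of a standing person whose feet project to each image row.

    Assumptions (not taken from the source): pinhole camera, zero roll, flat
    ground, principal point at the image center, tilt measured downward from
    the horizontal. Rows that do not see the ground in front of the camera,
    or whose head position falls behind the image plane, are set to None.
    """
    theta = math.radians(tilt_deg)
    cy = (image_rows - 1) / 2.0  # assumed principal-point row
    sizes = []
    for v in range(image_rows):
        # Angle of the viewing ray through row v, measured below the horizontal.
        phi_foot = theta + math.atan((v - cy) / focal_px)
        if phi_foot <= 0.0 or phi_foot >= math.pi / 2.0:
            sizes.append(None)  # at/above the horizon, or past straight down
            continue
        # Horizontal distance to the ground point seen at this row.
        dist = height_m / math.tan(phi_foot)
        # Angle below the horizontal of the ray to the person's head.
        phi_head = math.atan2(height_m - person_m, dist)
        if abs(phi_head - theta) >= math.pi / 2.0:
            sizes.append(None)  # head is not in front of the image plane
            continue
        # Project the head back onto the image and measure the pixel extent.
        v_head = cy + focal_px * math.tan(phi_head - theta)
        sizes.append(v - v_head)
    return sizes


# Illustrative values only: camera 3 m high, tilted 30 degrees downward.
rows = perspective_map_rows(height_m=3.0, tilt_deg=30.0, focal_px=1000.0, image_rows=480)
```

In this sketch, rows nearer the bottom of the image correspond to ground points closer to the camera and therefore receive larger values, which matches the role of the perspective map described above. A lower camera or a smaller tilt angle moves the horizon into the image, and rows above it carry no ground-plane value.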
