Multimedia Analysis in Police–Citizen Communication: Supporting Daily Policing Tasks

This chapter describes an approach to multimedia analysis within an ICT-based tool for community policing. The approach covers technologies for automatically processing the audio, image, and video content that citizens send to the police as evidence. Alongside the technical details of how these technologies were developed, the chapter presents and discusses their performance in initial pilots that simulated near-real crime situations.
