Requirements for multimedia metadata schemes in surveillance applications for security

Surveillance for security requires communication between systems and humans, involves behavioural and multimedia research, and demands objective benchmarking of the performance of system components. Metadata representation schemes are essential for facilitating interoperability between systems and for defining ground truth annotations for surveillance research and benchmarks. Surveillance places specific requirements on these schemes. This paper establishes a clear and coherent terminology and uses it to present these requirements, which are then evaluated in three ways: their fitness in breadth for surveillance design patterns, their fitness in depth for a specific surveillance scenario, and their realism on the basis of existing schemes. The evaluation also shows that no existing metadata representation scheme fulfils all requirements. Guidelines are offered to those who wish to select or create a metadata scheme for surveillance for security.
