Large Scale Organization and Inference of an Imagery Dataset for Public Safety

Video applications and analytics are routinely projected to be a demanding and significant service on the Nationwide Public Safety Broadband Network. As part of a NIST PSCR-funded effort, the New Jersey Office of Homeland Security and Preparedness and MIT Lincoln Laboratory have been developing a computer vision dataset of operational and representative public safety scenarios. The scale and scope of this dataset necessitate a hierarchical organization approach for efficient compute and storage. We give an overview of architectural considerations, using the Lincoln Laboratory Supercomputing Cluster (LLSC) as a test architecture. We then describe how we organized the dataset across the LLSC and evaluated it with large-scale imagery inference across terabytes of data.
