Paper information - Federated tool for anonymization and annotation in image data

The increasing complexity of security challenges requires Law Enforcement Agencies (LEAs) to have improved analysis capabilities, e.g., through the use of Artificial Intelligence (AI). However, it is challenging to make sufficiently large, high-quality training and testing datasets available to the community that develops AI tools to support LEAs in their daily work. Due to legal and ethical issues, it is often undesirable to share raw data containing personal information. These issues can lead to a chicken-and-egg problem, where annotation/anonymization and the development of an AI tool depend on each other. This paper presents a federated tool for semi-automatic anonymization and annotation that facilitates the sharing of AI models and anonymized data without sharing raw data containing personal information. The tool uses federated learning to jointly train object detection models, reaching higher performance by combining the annotation efforts of multiple organizations. These models assist a person in anonymizing or annotating image data more efficiently, with human oversight. The results show that our privacy-enhancing federated approach, in which only models are shared, performs almost as well as a centralized approach with access to all data.
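The idea of sharing only models can be illustrated with a minimal sketch of federated averaging (FedAvg), a common aggregation rule in federated learning. The abstract does not state which aggregation rule the tool uses, so this is an assumption for illustration only. The `Weights` type, the function name `federated_average`, and the flat-list weight layout are all hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine client models into one model by a sample-weighted mean.

    Each entry in `updates` is (client_weights, num_local_samples).
    Only these weights leave a client; the raw images never do.
    """
    if not updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    reference = updates[0][0]
    averaged: Weights = {}
    for name, values in reference.items():
        acc = [0.0] * len(values)
        for weights, n in updates:
            # Clients with more annotated samples contribute proportionally more.
            for i, v in enumerate(weights[name]):
                acc[i] += v * (n / total)
        averaged[name] = acc
    return averaged


# Two hypothetical organizations with 1 and 3 annotated samples.
client_a = ({"w": [1.0, 2.0]}, 1)
client_b = ({"w": [3.0, 6.0]}, 3)
global_model = federated_average([client_a, client_b])
```

In a full training loop, the averaged model would be sent back to every organization for another round of local training on its own private data.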

Sabina B. van Rooij | H. Bouma | J. van Mil | Johan-Martijn ten Hove
