Automated Anonymisation of Visual and Audio Data in Classroom Studies

Understanding students' and teachers' verbal and non-verbal behaviour during instruction can yield valuable information about the quality of teaching. In education research, many studies aim to measure students' attentional focus on learning-related tasks, based on audio-visual recordings and manual or automated ratings of teacher and student behaviour. Student data is, however, highly sensitive, so ensuring high standards of data protection and privacy is of the utmost importance in current practice. In classroom management studies, for example, data collection is carried out with the consent of pupils, parents, teachers and school administrations. Nevertheless, there are often students whose data cannot be used for research purposes. Excluding these students from the classroom is an unnatural intrusion into its organisation. A possible solution is to request permission to record all students (including those who do not voluntarily participate in the study) and to anonymise the data of non-participants. Yet manual anonymisation of audio-visual data is very demanding. In this study, we examine the use of artificial intelligence methods to automatically anonymise the visual and audio data of a particular person.
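The visual side of such a pipeline can be illustrated with a minimal, dependency-free sketch. It assumes that face bounding boxes and identities are supplied by upstream models (e.g. a face detector and a face-recognition embedding, which are not shown here), and it pixelates only the faces of non-consenting persons by replacing each such region with its mean value. The function names, the grey-value frame representation, and mean-pixelation itself are illustrative assumptions, not the paper's method; stronger approaches (e.g. GAN-based face replacement) would slot into the same structure.

```python
# Illustrative sketch: selectively anonymise faces of non-consenting persons.
# Frames are plain 2D lists of grey values to keep the example dependency-free;
# detections are assumed to come from an upstream detector + recogniser.

def pixelate_region(frame, x, y, w, h):
    """Replace the w x h region at (x, y) with its mean value (coarse anonymisation)."""
    pixels = [frame[r][c] for r in range(y, y + h) for c in range(x, x + w)]
    mean = sum(pixels) // len(pixels)
    for r in range(y, y + h):
        for c in range(x, x + w):
            frame[r][c] = mean
    return frame

def anonymise_frame(frame, detections, consenting_ids):
    """detections: list of (x, y, w, h, identity) tuples; only faces whose
    identity is NOT in consenting_ids are pixelated."""
    for x, y, w, h, identity in detections:
        if identity not in consenting_ids:
            pixelate_region(frame, x, y, w, h)
    return frame

# Usage: a 4x4 frame with two hypothetical faces; only person "B" consented.
frame = [[10, 20, 50, 90],
         [30, 40, 50, 90],
         [15, 15, 60, 80],
         [15, 15, 60, 80]]
detections = [(0, 0, 2, 2, "A"), (2, 0, 2, 2, "B")]
anonymise_frame(frame, detections, {"B"})
print(frame[0][0])  # 25: region of "A" replaced by its mean
print(frame[0][2])  # 50: region of "B" left untouched
```

The same selective logic transfers to the audio channel: a speaker-diarisation step assigns segments to identities, and only segments of non-consenting speakers are replaced or voice-converted.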
