Detecting Surrounding Users by Reverberation Analysis with a Smart Speaker and Microphone Array

Smart speakers such as Amazon Echo and Google Home have recently become widespread. These devices support users' daily lives through a voice interface, accepting voice commands to operate appliances and to order goods from online shops. However, smart speakers have been reported to be vulnerable to malicious attacks that steal personal information and/or place unwanted orders by playing voice commands from a device near the speaker, exploiting the fact that smart speakers cannot distinguish a human voice from a machine-generated one. A new type of attack called DolphinAttack, which issues voice commands as ultrasound inaudible to humans, has also been reported. A method is therefore needed to identify whether a human or a machine is sending voice commands to a smart speaker. In this paper, to prevent such machine-voice attacks on a smart speaker while residents are absent, we propose a system consisting of a speaker and a microphone array that detects the presence of a human nearby, with the expectation that it can be incorporated into smart speakers in the future. In the proposed system, the speaker emits a sonar sound generated by Orthogonal Frequency Division Multiplexing (OFDM) in all directions, an 8-channel microphone array mounted on top of the speaker receives the reflected sound, and human presence is judged by comparing the reflected sound with that measured in the same environment with no human present. Through experiments with a prototype system, we confirmed that the proposed system can detect human presence from 0.5 seconds of the reflected signal.
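
As a rough illustration of the comparison step described above, the following Python sketch builds one OFDM probe symbol, estimates a per-channel frequency response from the received sound, and scores how far it deviates from an empty-room baseline. It is a toy simulation under stated assumptions, not the prototype's implementation: the sample rate, symbol length, frequency band, noise level, threshold, simulated impulse responses, and all function names are hypothetical choices made for this example. Only the 8-channel array and the baseline comparison come from the text.

```python
import numpy as np

FS = 48_000      # sample rate in Hz (assumed, not from the paper)
N_FFT = 1024     # OFDM symbol length in samples (assumed)
N_CH = 8         # microphone channels, as stated in the abstract
THRESHOLD = 0.05 # decision threshold on the deviation score (assumed)

def make_ofdm_probe(rng, band=(18_000, 22_000)):
    """One real OFDM symbol: unit-magnitude, random-phase subcarriers in `band` (assumed band)."""
    freqs = np.fft.rfftfreq(N_FFT, d=1 / FS)
    active = (freqs >= band[0]) & (freqs <= band[1])
    spectrum = np.zeros(freqs.size, dtype=complex)
    spectrum[active] = np.exp(2j * np.pi * rng.random(active.sum()))
    return np.fft.irfft(spectrum, n=N_FFT), spectrum, active

def simulate_received(probe, impulse_responses, rng, noise_std=1e-4):
    """Toy stand-in for the microphone capture: circular convolution plus white noise."""
    h_spec = np.fft.rfft(impulse_responses, n=N_FFT, axis=-1)
    clean = np.fft.irfft(np.fft.rfft(probe) * h_spec, n=N_FFT, axis=-1)
    return clean + noise_std * rng.standard_normal(clean.shape)

def channel_response(received, spectrum, active):
    """Per-channel frequency response on the active subcarriers; `received` is (N_CH, N_FFT)."""
    rx_spec = np.fft.rfft(received, axis=-1)
    return rx_spec[:, active] / spectrum[active]

def presence_score(h_now, h_empty):
    """Relative deviation from the empty-room baseline, averaged over channels."""
    num = np.linalg.norm(h_now - h_empty, axis=-1)
    den = np.linalg.norm(h_empty, axis=-1)
    return float(np.mean(num / den))

rng = np.random.default_rng(0)
probe, spectrum, active = make_ofdm_probe(rng)

# Hypothetical empty room: direct path plus one wall echo on every channel.
room = np.zeros((N_CH, 256))
room[:, 0] = 1.0
room[:, 120] = 0.3

# Hypothetical occupied room: one extra reflection at a channel-dependent delay.
occupied = room.copy()
occupied[np.arange(N_CH), 40 + np.arange(N_CH)] += 0.2

h_empty = channel_response(simulate_received(probe, room, rng), spectrum, active)
h_again = channel_response(simulate_received(probe, room, rng), spectrum, active)
h_human = channel_response(simulate_received(probe, occupied, rng), spectrum, active)

score_empty = presence_score(h_again, h_empty)
score_human = presence_score(h_human, h_empty)
print("empty room   :", score_empty, score_empty > THRESHOLD)
print("occupied room:", score_human, score_human > THRESHOLD)
```

In this sketch a second measurement of the empty room stays below the threshold, while the extra reflection pushes the score above it. A real deployment would additionally need a cyclic prefix or guard interval, averaging over many symbols within the measurement window, and a threshold calibrated to the room.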