EchoSafe: Sonar-based Verifiable Interaction with Intelligent Digital Agents

Voice-controlled interactive smart speakers, such as Google Home, Amazon Echo, and Apple HomePod, are becoming commonplace in today's homes. These devices listen continuously for user commands, which are triggered by special keywords such as "Alexa" and "Hey Siri". Recent research has shown that these devices are vulnerable to attacks through malicious voice commands from nearby devices. Such commands can easily be sent while the home is unoccupied, leaving the user unaware of the attack. We present EchoSafe, a user-friendly sonar-based defense against these attacks. When a critical command is sent to the smart speaker, EchoSafe emits an audio pulse and then applies post-processing to determine whether the user is present in the room. We detect the user's presence during critical commands with 93.13% accuracy, and our solution can be extended to defend against other attack scenarios as well.
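The abstract does not specify the pulse or the post-processing, so the following is only an illustrative sketch of the general idea of sonar-based presence detection, not EchoSafe's actual pipeline. It assumes a short linear chirp, a simulated room modeled as a few delayed reflections, a matched filter to obtain an echo profile, and a simple threshold on the deviation from an empty-room baseline. All names, frequencies, delays, and the threshold are hypothetical.

```python
import numpy as np

FS = 48_000  # sample rate in Hz (assumed, not from the paper)

def make_chirp(f0=2_000.0, f1=6_000.0, duration=0.01):
    """Windowed linear chirp used as the probe pulse (hypothetical parameters)."""
    t = np.arange(int(FS * duration)) / FS
    k = (f1 - f0) / duration  # sweep rate in Hz per second
    return np.sin(2 * np.pi * (f0 * t + 0.5 * k * t**2)) * np.hanning(t.size)

def simulate_recording(pulse, reflectors, length=4096, noise=0.01, seed=0):
    """Toy room model: each reflector is a (delay_in_samples, gain) pair."""
    rng = np.random.default_rng(seed)
    rec = noise * rng.standard_normal(length)
    for delay, gain in reflectors:
        rec[delay:delay + pulse.size] += gain * pulse
    return rec

def echo_profile(recording, pulse):
    """Matched filter: correlate the recording with the emitted pulse."""
    corr = np.correlate(recording, pulse, mode="valid")
    return np.abs(corr) / np.sum(pulse**2)

def presence_score(profile, baseline):
    """Relative deviation of an echo profile from the empty-room baseline."""
    return np.linalg.norm(profile - baseline) / np.linalg.norm(baseline)

pulse = make_chirp()
walls = [(300, 0.8), (900, 0.5), (1500, 0.3)]  # static reflections (assumed)

# Baseline captured once while the room is known to be empty.
baseline = echo_profile(simulate_recording(pulse, walls, seed=1), pulse)

# Two later probes: one in an empty room, one with an extra reflector (a person).
empty = echo_profile(simulate_recording(pulse, walls, seed=2), pulse)
occupied = echo_profile(
    simulate_recording(pulse, walls + [(600, 0.4)], seed=3), pulse
)

THRESHOLD = 0.2  # hypothetical decision threshold
empty_present = presence_score(empty, baseline) > THRESHOLD
occupied_present = presence_score(occupied, baseline) > THRESHOLD
print("empty room flagged as occupied:", empty_present)
print("occupied room flagged as occupied:", occupied_present)
```

A fixed threshold on one score is the simplest possible decision rule. A real system would likely need to cope with moving furniture, pets, and background noise, which is presumably where the post-processing mentioned above comes in.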
