We Can Hear You with Wi-Fi!

Recent work has used Wi-Fi signals to “see” people's motions and locations. This paper asks the following question: can Wi-Fi “hear” what we say? We present WiHear, which enables Wi-Fi signals to “hear” speech without deploying any devices. To achieve this, WiHear needs to detect and analyze fine-grained radio reflections from mouth movements. WiHear solves this micro-movement detection problem by introducing the Mouth Motion Profile, which leverages partial multipath effects and wavelet packet transformation. Since Wi-Fi signals do not require line-of-sight, WiHear can “hear” people talking anywhere within the radio range. Further, WiHear can simultaneously “hear” multiple people talking by leveraging MIMO technology. We implement WiHear on both the USRP N210 platform and commercial Wi-Fi infrastructure. Results show that, within our pre-defined vocabulary, WiHear achieves an average detection accuracy of 91 percent for a single individual speaking no more than six words, and up to 74 percent for no more than three people talking simultaneously. Moreover, the detection accuracy can be further improved by deploying multiple receivers at different angles.
