Targeting ultrasound-based gesture recognition, this paper proposes a new universal PRESS/HOLD/RELEASE approach that leverages the diversity of gestures performed on smart devices such as mobile phones and IoT nodes. The new set of gestures is generated by interleaving PRESS/HOLD/RELEASE patterns (abbreviated P/H/R) with gestures such as sweeps between microphones. A P/H/R pattern is formed by a hand as it approaches the top of a microphone (a virtual Press), settles there for an arbitrary period of time (a virtual Hold), and finally departs (a virtual Release). The same hand can then sweep to a second microphone and perform another P/H/R. Interleaving P/H/R patterns in this way expands the number of available gestures. Assuming an on-board speaker that transmits ultrasonic signals, detection is performed on the Doppler shift produced by the hand as it approaches and departs the top of a microphone. The Doppler shift readings are represented as a sequence of down-mixed ultrasonic spectrogram frames. We train a Temporal Convolutional Network (TCN) to classify the P/H/R patterns under different environmental noises. Our experimental results show that P/H/R patterns at the top of a microphone can be detected with 96.6% accuracy under different noise conditions. A group of P/H/R-based gestures was tested on a commercial off-the-shelf (COTS) Samsung Galaxy S7 Edge. Different P/H/R-interleaved gestures (such as sweeps and long taps) are designed using two microphones and a single speaker while requiring as few as $\sim 5\mathrm{K}$ parameters and as little as $\sim 0.15$ million operations (MOPs) of compute per inference. The P/H/R-interleaved gestures are intuitive and therefore easy for end users to learn, paving the way for deployment on mass-produced smartphones and smart speakers.
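The down-mixing step described above can be illustrated with a minimal sketch: multiply the microphone signal by a complex tone at the carrier frequency, take an FFT per frame, and look for energy offset from the carrier, which is where the Doppler shift of an approaching or departing hand shows up. The sample rate, carrier frequency, frame length, and search band below are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

FS = 48_000        # assumed sample rate (Hz)
F_TONE = 20_000    # assumed ultrasonic carrier (Hz)
N_FFT = 2048       # assumed frame length (samples)

def doppler_frame(mic_samples, frame_idx):
    """Down-mix one microphone frame to baseband and return its
    magnitude spectrum, fftshifted so the carrier sits at the
    centre bin; Doppler energy from a moving hand appears as an
    offset above (approaching) or below (departing) that bin."""
    start = frame_idx * N_FFT
    frame = mic_samples[start:start + N_FFT]
    t = (start + np.arange(N_FFT)) / FS
    baseband = frame * np.exp(-2j * np.pi * F_TONE * t)   # down-mix
    spec = np.fft.fftshift(np.fft.fft(baseband * np.hanning(N_FFT)))
    return np.abs(spec)

# Synthetic check: a hand moving toward the mic at v m/s shifts the
# echo by roughly f_d = F_TONE * v / c (c ~ 343 m/s), ~58 Hz at 1 m/s.
v, c = 1.0, 343.0
f_echo = F_TONE * (1 + v / c)
t = np.arange(FS) / FS
rx = np.cos(2 * np.pi * f_echo * t)

mags = doppler_frame(rx, 4)
half = N_FFT // 2
band = 25  # Doppler shifts of interest are small: search ~±585 Hz
window = mags[half - band: half + band + 1]
offset_hz = (int(np.argmax(window)) - band) * FS / N_FFT
print(offset_hz)  # positive offset: energy above the carrier
```

Restricting the peak search to a narrow band around the carrier mirrors how a real pipeline would frame the spectrogram: only a few bins on either side of the tone carry gesture information, which is also why the paper's classifier can stay tiny.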