Word Error Rate Comparison Between Single- and Double-Radar Solutions for Silent Speech Recognition

Silent speech recognition (SSR) is a technology that translates human speech into text without relying on acoustic voice information. Various sensors, such as vision, electromyography, electromagnetic articulography, and radar sensors, can be used to build an SSR system. Because radar signals are less intuitive to interpret, radar-based SSR research is less common and remains at a more basic level than work on other sensors. As a basic step in this research area, in this study we investigated whether a single-radar or a double-radar configuration yields better performance for an SSR system. To this end, we estimated the word error rate (WER) of each configuration. The results showed that the double-radar-based SSR system achieved a lower WER. This indicates that the number of radar sensors used in SSR can affect its performance; therefore, when designing a radar-based SSR hardware platform, the number of radar sensors needed for the best performance must be considered.
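For context, WER is conventionally defined as (S + D + I) / N, where S, D, and I are the word substitutions, deletions, and insertions needed to align the recognized hypothesis with the reference transcript, and N is the number of reference words. The paper does not publish its evaluation code, so the following is only a minimal Python sketch of this standard metric; the example sentences are hypothetical.

    # Minimal sketch of the standard WER computation: WER = (S + D + I) / N.
    # Not the authors' code; shown only to illustrate the metric used above.

    def wer(reference: str, hypothesis: str) -> float:
        ref = reference.split()
        hyp = hypothesis.split()
        # Levenshtein distance over words via dynamic programming:
        # d[i][j] = minimum edits to convert ref[:i] into hyp[:j].
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i  # delete all remaining reference words
        for j in range(len(hyp) + 1):
            d[0][j] = j  # insert all remaining hypothesis words
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                sub = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,        # deletion
                              d[i][j - 1] + 1,        # insertion
                              d[i - 1][j - 1] + sub)  # substitution/match
        return d[len(ref)][len(hyp)] / len(ref)

    # Hypothetical example: one substitution in a four-word reference
    # gives WER = 1/4 = 0.25.
    print(wer("open the front door", "open the back door"))

A lower WER means the recognized word sequence is closer to the reference, which is the sense in which the double-radar system is reported to perform better.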
