A multi-frame blocking for signal segmentation in voice command recognition

Frame blocking segments the voice signal into overlapping frames so that information is preserved across frame boundaries. Various sound features can then be extracted in the time domain from this frame-based representation of the signal. However, the frame blocking stage introduces spectral distortion at the beginning and end of each frame, which gives rise to discontinuous signal pieces. These discontinuities become part of the voice feature if features are extracted directly in the time domain. In the frequency domain, they are removed using windowing techniques, but windowing sometimes alters the signal information at the beginning and end of the frame. This study proposes a multi-frame blocking method to optimize the frame blocking stage. The aim of the proposed method is to preserve the integrity of the signal information while minimizing the number of frames. The voice signal samples in this study are command signals used to control the basic movements of a robot. The absolute portion of the power spectral density (PSD) was used as the voice signal feature, with the average values represented as grayscale images. The results show an 18.29% increase in the performance of the voice signal recognition stage compared with using only the conventional frame blocking method.
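As a point of reference, the conventional pipeline described above (overlapping frame blocking, windowing, then a PSD estimate per frame) can be sketched as follows. This is only an illustrative sketch of the conventional baseline, not the proposed multi-frame blocking method, whose details are not given here. The function names `frame_blocking` and `frame_psd`, the frame length of 256 samples, the hop of 128 samples, the Hamming window, and the 440 Hz test tone are all assumptions chosen for illustration.

```python
import numpy as np


def frame_blocking(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (conventional frame blocking).

    Consecutive frames overlap by frame_len - hop samples. Trailing samples
    that do not fill a whole frame are dropped.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        # Zero-pad very short signals so that at least one frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row i holds the sample indices i*hop ... i*hop + frame_len - 1.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]


def frame_psd(frames, window=None):
    """Unnormalized periodogram-style PSD estimate for each frame."""
    if window is not None:
        # Windowing tapers the frame edges to reduce boundary discontinuities.
        frames = frames * window
    spectrum = np.fft.rfft(frames, axis=1)
    return np.abs(spectrum) ** 2 / frames.shape[1]


# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)

frames = frame_blocking(x, frame_len=256, hop=128)  # 50% overlap
psd = frame_psd(frames, window=np.hamming(256))
mean_psd = psd.mean(axis=0)  # average PSD over all frames
```

With these assumed settings, each frame shares half of its samples with its neighbor, and the tapered window is what trades edge discontinuities for altered information near the frame boundaries, the trade-off the abstract refers to.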
