Efficient remote access system based on decoded and decompressed speech signals

This paper investigates the effect of decoding and decompression on Speaker Identification (SI) in a remote access system. Coding and compression are applied for communication purposes, as is routine for voice communication over the Internet or mobile networks. In the proposed system, the speech signal is coded with the Linear Predictive Coding (LPC) technique. The speech signal is also compressed using two techniques. The first relies on decimation to compress the signal, which is then recovered using inverse solutions, namely maximum entropy and regularized reconstruction. The second is Compressive Sensing (CS), in which the speech signal is reconstructed using linear programming. The coded or compressed speech signal is transmitted to the receiver over a wireless communication channel. At the receiver, the received signal is decoded or decompressed, and SI is then performed on the decoded or decompressed speech signal. The performance of the coding and compression techniques is evaluated using metrics such as Perceptual Evaluation of Speech Quality (PESQ) and Dynamic Time Warping (DTW). The objective of SI is to achieve the security needed for the remote access system, and this security can be increased using the coding and compression processes. In the SI system, feature vectors are extracted from different discrete transforms such as the Discrete Wavelet Transform (DWT), Discrete Cosine Transform (DCT), and Discrete Sine Transform (DST), in addition to the time domain. The recognition rate for each transform is computed to evaluate the performance of the SI system.
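As an illustration of one of the metrics named above, the following is a minimal sketch of a DTW distance between an original and a reconstructed signal, using the classic dynamic-programming recurrence. The function name `dtw_distance`, the absolute-difference local cost, and the toy signals are assumptions made for this example; the paper does not specify its DTW configuration, and the linear-interpolation recovery used here is only a stand-in for the maximum entropy and regularized reconstruction methods.

```python
import math


def dtw_distance(x, y):
    """Accumulated DTW cost between two 1-D sequences (hypothetical helper).

    Assumes an absolute-difference local cost and the three standard
    step directions (insertion, deletion, match).
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in x only
                                 cost[i][j - 1],      # step in y only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


# Toy signal: one period of a sine wave sampled at 32 points.
original = [math.sin(2 * math.pi * k / 32) for k in range(32)]

# Decimate by 2, then recover by linear interpolation (a stand-in for
# the inverse solutions named in the abstract, not the paper's method).
decimated = original[::2]
recovered = []
for k, sample in enumerate(decimated):
    nxt = decimated[k + 1] if k + 1 < len(decimated) else sample
    recovered.extend([sample, 0.5 * (sample + nxt)])

print("DTW(original, recovered) =", dtw_distance(original, recovered))
```

A smaller DTW value indicates that the reconstructed signal is closer in shape to the original after optimal time alignment; a value of zero means the two sequences can be warped onto each other exactly.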
