Information Security for Automatic Speaker Identification

Speaker identification is widely used in security systems. In remote access systems, speaker utterances are recorded and sent through a communication channel to a receiver that performs the identification process. Speaker identification is based on characterizing each speaker with a set of features extracted from his or her utterance. Extracting the features from a clean speech signal favors a high success rate in the identification process. In real cases, clean speech is often unavailable for feature extraction due to channel degradations, background noise, or interfering audio signals. As a result, there is a need for speech enhancement, deconvolution, and separation algorithms to solve the problem of speaker identification in the presence of impairments. Another important issue is how to enhance the security of a speaker identification system. This can be accomplished by embedding a watermark in the clean speech signal at the transmitter. If this watermark is extracted correctly at the receiver, it can be used to confirm that the speaker identification is correct. Another means of security enhancement is the encryption of speech at the transmitter. Speech encryption prevents eavesdroppers from obtaining the speech signals used for feature extraction, and so guards against unauthorized access to the system through synthesis trials. Multiple levels of security can be achieved by implementing both watermarking and encryption at the transmitter. The watermarking and encryption algorithms need to be robust to speech enhancement and deconvolution algorithms in order to achieve the required degree of security and the highest possible speaker identification rates.
This book provides, for the first time, a comprehensive literature review on how to improve the performance of speaker identification systems in noisy environments by combining different feature extraction techniques with speech enhancement, deconvolution, separation, watermarking, and/or encryption.
