Speech Privacy for Sound Surveillance Using Super-Resolution Based on Maximum Likelihood and Bayesian Linear Regression

Surveillance with multiple cameras and microphones is a promising way to trace the activities of suspicious persons for security purposes. When these sensors are connected to the Internet, however, they might also jeopardize innocent people's privacy because, as a result of human error, signals from the sensors might allow eavesdropping by malicious persons. This paper proposes exploiting super-resolution to address this problem. Super-resolution is a signal processing technique by which a high-resolution version of a signal can be reproduced from low-resolution versions of the same signal source. Because of this property, an intelligible speech signal can be reconstructed from multiple sensor signals, each of which is completely unintelligible on its own because of its sufficiently low sampling rate. A method based on Bayesian linear regression is proposed and compared with one based on maximum likelihood. Computer simulations using a simple sinusoidal input demonstrate that both methods restore the original signal from the signals that are actually measured. Moreover, the results show that the method based on Bayesian linear regression is more robust than the one based on maximum likelihood under various microphone configurations in noisy environments, and that this advantage is remarkable when the number of microphones used in the process is as small as the minimum required. Finally, listening tests using speech signals confirmed that the mean opinion score (MOS) of the reconstructed signal reaches 3, while that of the original signal captured at each single microphone is almost 1.

key words: sensor network, sound surveillance, maximum likelihood, Bayesian linear regression, mean opinion score
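The idea of recovering a high-rate signal from several low-rate recordings can be illustrated with a small numerical sketch. It is not the paper's implementation: the observation model (each microphone decimates by a factor `decim` after a known fractional delay, approximated by linear interpolation on a circular time axis), the delays, the noise level, and the function names `observation_matrix`, `ml_estimate`, and `bayes_estimate` are all assumptions made for illustration. Under Gaussian noise the maximum likelihood estimate reduces to least squares, and the Bayesian linear regression estimate shown here is the posterior mean under a zero-mean isotropic Gaussian prior with precision `alpha` and noise precision `beta`.

```python
import numpy as np


def observation_matrix(n, decim, delays):
    """Stack one low-rate sampler per microphone (hypothetical model).

    Each microphone keeps every `decim`-th sample of the high-rate signal
    after a delay `tau` (in high-rate samples); the fractional part of the
    delay is approximated by linear interpolation, with circular indexing.
    """
    rows = []
    for tau in delays:
        whole = int(np.floor(tau))
        frac = tau - whole
        for k in range(n // decim):
            row = np.zeros(n)
            i = (k * decim + whole) % n
            row[i] += 1.0 - frac
            row[(i + 1) % n] += frac
            rows.append(row)
    return np.array(rows)


def ml_estimate(A, y):
    """Maximum likelihood under Gaussian noise: ordinary least squares."""
    return np.linalg.lstsq(A, y, rcond=None)[0]


def bayes_estimate(A, y, alpha, beta):
    """Posterior mean of Bayesian linear regression.

    Assumes a prior x ~ N(0, I / alpha) and noise precision beta, so the
    posterior mean solves (alpha I + beta A^T A) m = beta A^T y.
    """
    n = A.shape[1]
    return np.linalg.solve(alpha * np.eye(n) + beta * A.T @ A, beta * A.T @ y)


# Assumed setup: 64 high-rate samples, 4 microphones, each sampling at 1/4 rate.
n, decim = 64, 4
delays = [0.0, 1.3, 2.1, 3.6]            # illustrative delays, in high-rate samples
t = np.arange(n)
x = np.sin(2 * np.pi * 12 * t / n)       # 12 cycles: aliased at each single microphone

A = observation_matrix(n, decim, delays)
rng = np.random.default_rng(0)
sigma = 0.05                             # assumed noise standard deviation
y = A @ x + sigma * rng.standard_normal(A.shape[0])

x_ml = ml_estimate(A, y)
x_bayes = bayes_estimate(A, y, alpha=1.0, beta=1.0 / sigma**2)

print("ML RMS error:   ", np.sqrt(np.mean((x_ml - x) ** 2)))
print("Bayes RMS error:", np.sqrt(np.mean((x_bayes - x) ** 2)))
```

Each microphone in this toy setup records only 16 samples, too few to represent the 12-cycle sinusoid on its own, while the stacked system of all four recordings determines the full-rate signal. How the two estimators compare depends on the delays, the noise level, and the prior precision, so the printed errors should be read as one illustrative run rather than as a reproduction of the reported results.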
