Remote Voice Acquisition in Multimodal Surveillance

Multimodal surveillance systems using visible/IR cameras and other sensors are widely deployed today for security purposes, particularly when subjects are at a large distance. However, audio as an important data source has not been well explored. One reason is that audio detection using microphones requires installation close to the subjects being monitored. In this paper, we investigate a novel "optical" sensor, the laser Doppler vibrometer (LDV), for capturing voice signals at very long range to realize a truly remote and multimodal surveillance system. Speech enhancement approaches are studied based on the characteristics of LDV audio. Experimental results show that remote voice detection via an LDV is promising when appropriate targets close to the human subjects are chosen in the environment.
