Integrating LDV Audio and IR Video for Remote Multimodal Surveillance

This paper describes a multimodal surveillance system for human signature detection. The system combines three types of sensors: infrared (IR) cameras, pan/tilt/zoom (PTZ) color cameras, and laser Doppler vibrometers (LDVs). The LDV is explored as a new non-contact, remote voice detector. We have found that voice energy vibrates most objects and that these vibrations can be detected by an LDV. Because the signals captured by the LDV are very noisy, we have designed algorithms that use Gaussian bandpass filtering and adaptive volume scaling to enhance the LDV voice signals. The enhanced voice signals are intelligible when measured from targets without retroreflective finishes at short or medium distances (<100 m). With retroreflective tape on the target, the distance can be extended to as far as 300 m. However, manually searching for a target that both vibrates and reflects well, and then focusing the laser beam on it, is very difficult at medium and large distances. We therefore also discuss IR imaging for target selection and localization. Future work remains in automatic LDV targeting and intelligent refocusing for long-range LDV listening.
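The two enhancement steps named above, Gaussian bandpass filtering and adaptive volume scaling, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the frequency-domain formulation, the center frequency and bandwidth, the frame length, and the gain cap are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_bandpass(signal, sample_rate, center_hz=1650.0, sigma_hz=700.0):
    """Weight the spectrum with a Gaussian centred on center_hz.

    center_hz and sigma_hz are assumed values that roughly cover the
    speech band; they are not taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    weights = np.exp(-0.5 * ((freqs - center_hz) / sigma_hz) ** 2)
    return np.fft.irfft(spectrum * weights, n=len(signal))


def adaptive_volume_scale(signal, sample_rate, frame_s=0.05,
                          target_peak=0.8, max_gain=20.0):
    """Scale each short frame so its peak approaches target_peak.

    The gain is capped at max_gain so that frames containing only
    noise are not amplified without bound (assumed policy).
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = max(1, int(frame_s * sample_rate))
    out = np.empty(len(signal), dtype=float)
    for start in range(0, len(signal), frame_len):
        frame = signal[start:start + frame_len]
        peak = np.max(np.abs(frame))
        gain = min(max_gain, target_peak / peak) if peak > 0 else 1.0
        out[start:start + frame_len] = frame * gain
    return out


if __name__ == "__main__":
    # Synthetic stand-in for an LDV recording: a weak 1 kHz tone
    # buried under stronger 50 Hz low-frequency interference.
    rate = 8000
    t = np.arange(rate) / rate
    raw = 0.01 * np.sin(2 * np.pi * 1000 * t) + 0.05 * np.sin(2 * np.pi * 50 * t)
    enhanced = adaptive_volume_scale(gaussian_bandpass(raw, rate), rate)
    print("peak before:", np.max(np.abs(raw)), "peak after:", np.max(np.abs(enhanced)))
```

In this sketch the gain changes abruptly at frame boundaries, which can produce audible clicks; a practical version would likely smooth the gain over time.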
