Deep Learning Assisted Time-Frequency Processing for Speech Enhancement on Drones

[1]  Tara N. Sainath,et al.  Deep Learning for Audio Signal Processing , 2019, IEEE Journal of Selected Topics in Signal Processing.

[2]  DeLiang Wang,et al.  Time-Frequency Masking in the Complex Domain for Speech Dereverberation and Denoising , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[3]  Jinwon Lee,et al.  A Fully Convolutional Neural Network for Speech Enhancement , 2016, INTERSPEECH.

[4]  DeLiang Wang,et al.  Towards Scaling Up Classification-Based Speech Separation , 2013, IEEE Transactions on Audio, Speech, and Language Processing.

[5]  Carlo Drioli,et al.  Beamforming-Based Acoustic Source Localization and Enhancement for Multirotor UAVs , 2018, 2018 26th European Signal Processing Conference (EUSIPCO).

[6]  Björn W. Schuller,et al.  Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR , 2015, LVA/ICA.

[7]  Andrea Cavallaro,et al.  Acoustic Sensing From a Multi-Rotor Drone , 2018, IEEE Sensors Journal.

[8]  Yi Hu,et al.  Evaluation of Objective Quality Measures for Speech Enhancement , 2008, IEEE Transactions on Audio, Speech, and Language Processing.

[9]  Franz Pernkopf,et al.  DNN-based speech mask estimation for eigenvector beamforming , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[10]  Dario Floreano,et al.  Robust acoustic source localization of emergency signals from Micro Air Vehicles , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[11]  Mark D. Plumbley,et al.  Single channel audio source separation using convolutional denoising autoencoders , 2017, 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP).

[12]  Quoc V. Le,et al.  Recurrent Neural Networks for Noise Reduction in Robust ASR , 2012, INTERSPEECH.

[13]  Hiroshi G. Okuno,et al.  Design of UAV-Embedded Microphone Array System for Sound Source Localization in Outdoor Environments , 2017, Sensors.

[14]  Jun Du,et al.  Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement , 2017, INTERSPEECH.

[15]  Li-Rong Dai,et al.  A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[16]  Keisuke Nakamura,et al.  Outdoor auditory scene analysis using a moving microphone array embedded in a quadrocopter , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[17]  Joshua D. Reiss,et al.  An Iterative Approach to Source Counting and Localization Using Two Distant Microphones , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[18]  John R. Hershey,et al.  Deep long short-term memory adaptive beamforming networks for multichannel robust speech recognition , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[19]  Björn Schuller,et al.  Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments , 2017.

[20]  Richard C. Hendriks,et al.  Unbiased MMSE-Based Noise Power Estimation With Low Complexity and Low Tracking Delay , 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[21]  Andrea Cavallaro,et al.  Microphone-Array Ego-Noise Reduction Algorithms for Auditory Micro Aerial Vehicles , 2017, IEEE Sensors Journal.

[22]  Prasant Misra,et al.  Aerial Drones with Location-Sensitive Ears , 2018, IEEE Communications Magazine.

[23]  Lin Wang,et al.  Noise Power Spectral Density Estimation Using MaxNSR Blocking Matrix , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[24]  Sunggeun Yoo,et al.  Advanced sound capturing method with adaptive noise reduction system for broadcasting multicopters , 2015, 2015 IEEE International Conference on Consumer Electronics (ICCE).

[25]  Vincenzo Lippiello,et al.  Attentional multimodal interface for multidrone search in the Alps , 2016, 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC).

[26]  Brian R. Mace,et al.  Improving Power Spectral Density Estimation of Unmanned Aerial Vehicle Rotor Noise by Learning from Non-Acoustic Information , 2018, 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC).

[27]  Alper Bozkurt,et al.  Sound Localization Sensors for Search and Rescue Biobots , 2016, IEEE Sensors Journal.

[28]  Tomohiro Nakatani,et al.  Integrating DNN-based and spatial clustering-based mask estimation for robust MVDR beamforming , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[29]  DeLiang Wang,et al.  Ideal ratio mask estimation using deep neural networks for robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[30]  Peilin Liu,et al.  Robust Beamforming for Speech Recognition Using DNN-Based Time-Frequency Masks Estimation , 2018, IEEE Access.

[31]  Antoine Deleforge,et al.  DREGON: Dataset and Methods for UAV-Embedded Sound Source Localization , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[32]  Andrea Cavallaro,et al.  Ear in the sky: Ego-noise reduction for auditory micro aerial vehicles , 2016, 2016 13th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[33]  Dario Floreano,et al.  On-Board Relative Bearing Estimation for Teams of Drones Using Sound , 2016, IEEE Robotics and Automation Letters.

[34]  Liang Lu,et al.  Deep beamforming networks for multi-channel speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[35]  Reinhold Häb-Umbach,et al.  Neural network based spectral mask estimation for acoustic beamforming , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[36]  Yu Tsao,et al.  Speech enhancement based on deep denoising autoencoder , 2013, INTERSPEECH.

[37]  François Chollet,et al.  Keras: The Python Deep Learning library , 2018.

[38]  Andrea Cavallaro,et al.  Tracking a moving sound source from a multi-rotor drone , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[39]  DeLiang Wang,et al.  Supervised Speech Separation Based on Deep Learning: An Overview , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[40]  Jun Du,et al.  Multiple-target deep learning for LSTM-RNN based speech enhancement , 2017, 2017 Hands-free Speech Communications and Microphone Arrays (HSCMA).

[41]  Keisuke Nakamura,et al.  Improvement in outdoor sound source detection using a quadrotor-embedded microphone array , 2014, 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[42]  Yu Tsao,et al.  Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks , 2017, IEEE Transactions on Emerging Topics in Computational Intelligence.

[43]  Jont B. Allen,et al.  Image method for efficiently simulating small‐room acoustics , 1976.

[44]  Yusuke Hioka,et al.  Speech enhancement using a microphone array mounted on an unmanned aerial vehicle , 2016, 2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC).

[45]  Andrea Cavallaro,et al.  Time-frequency processing for sound source localization from a micro aerial vehicle , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[46]  Timo Gerkmann,et al.  Noise power spectral density estimation using MaxNSR blocking matrix , 2015.

[47]  Jonathan Le Roux,et al.  Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks , 2016, INTERSPEECH.

[48]  Katsutoshi Itoyama,et al.  Noise correlation matrix estimation for improving sound source localization by multirotor UAV , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[49]  Slim Essid,et al.  DNN-based Distributed Multichannel Mask Estimation for Speech Enhancement in Microphone Arrays , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[50]  Mark Hasegawa-Johnson,et al.  Speech Enhancement Using Bayesian Wavenet , 2017, INTERSPEECH.

[51]  DeLiang Wang,et al.  On Training Targets for Supervised Speech Separation , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[52]  Marc Moonen,et al.  GSVD-based optimal filtering for single and multimicrophone speech enhancement , 2002, IEEE Transactions on Signal Processing.

[53]  Sunggeun Yoo,et al.  Two-stage adaptive noise reduction system for broadcasting multicopters , 2016, 2016 IEEE International Conference on Consumer Electronics (ICCE).

[54]  L. Marino,et al.  Experimental analysis on the noise of propellers for small UAV , 2013.

[55]  Andrea Cavallaro,et al.  Audio-visual sensing from a quadcopter: dataset and baselines for source localization and sound enhancement , 2019, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[56]  Andrea Cavallaro,et al.  Multi-Modal Localization and Enhancement of Multiple Sound Sources from a Micro Aerial Vehicle , 2017, ACM Multimedia.