Paper Information - Recent Advances in Clock Synchronization for Packet-Switched Networks

Recent Advances in Clock Synchronization for Packet-Switched Networks

Speech enhancement is a core problem in audio signal processing with commercial applications in devices as diverse as mobile phones, conference call systems, smart assistants, and hearing aids. An essential component in the design of speech enhancement algorithms is acoustic source localization. Speaker localization is also directly applicable to many other audio-related tasks, e.g., automated camera steering, teleconferencing systems, and robot audition. From a signal processing perspective, speaker localization is the task of mapping multichannel speech signals to 3-D source coordinates. To obtain viable solutions for this mapping, an accurate description of the source wave propagation captured by the respective acoustic channel is required. In fact, the acoustic channels can be considered as the spatial fingerprints characterizing the positions of each of the sources in a reverberant enclosure. These fingerprints represent complex reflection patterns stemming from the surfaces and objects characterizing the enclosure. Hence, they are ...

Bracha Laufer-Goldshtein, Ronen Talmon and Sharon Gannot (2020), “Data-Driven Multi-Microphone Speaker Localization on Manifolds”, Foundations and Trends® in Signal Processing: Vol. 14, No. 1–2, pp. 1–161. DOI: 10.1561/2000000098. Full text available at: http://dx.doi.org/10.1561/2000000098

Rick S. Blum | Anantha K. Karthik
