Joint Alignment of Multiple Point Sets with Batch and Incremental Expectation-Maximization

This paper addresses the problem of registering multiple point sets. Solutions to this problem are often approximated by repeatedly solving pairwise registration problems, which treats the two sets in each pair unevenly: one serves as the model set and the other as the data set. The main drawback of this strategy is that the model set may contain noise and outliers, which degrades the estimation of the registration parameters. In contrast, the proposed formulation treats all the point sets on an equal footing: all the points are assumed to be drawn from a single central Gaussian mixture model (GMM), so registration is cast as a clustering problem. We formally derive batch and incremental EM algorithms that robustly estimate both the GMM parameters and the rotations and translations that optimally align the sets. Moreover, the mixture's means play the role of the registered set of points, while the variances provide rich information about the contribution of each component to the alignment. We thoroughly test the proposed algorithms on simulated data and on challenging real data collected with range sensors. We compare them with several state-of-the-art algorithms, and we show their potential for surface reconstruction from depth data.
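
The batch idea summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes isotropic component variances, equal mixing weights, and no outlier class, and it omits the incremental variant. The names `weighted_rigid` and `joint_em` and all parameter defaults are hypothetical. Each iteration computes soft assignments of every point to the mixture components, re-estimates one rotation and translation per set by weighted Procrustes alignment, then updates the means and variances shared by all sets.

```python
import numpy as np

def weighted_rigid(src, dst, w):
    """R, t minimizing sum_k w[k] * ||R @ src[k] + t - dst[k]||^2 (weighted Procrustes)."""
    w = w / w.sum()
    mu_s, mu_d = w @ src, w @ dst
    H = ((dst - mu_d) * w[:, None]).T @ (src - mu_s)   # d x d weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.eye(src.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))          # force a proper rotation, det(R) = +1
    R = U @ D @ Vt
    return R, mu_d - R @ mu_s

def joint_em(sets, means, sigma2=0.03, n_iter=30, eps=1e-6):
    """Jointly align point sets (list of N_j x d arrays) to shared GMM means (K x d).

    Simplifying assumptions (not taken from the paper): isotropic variances,
    equal priors, no outlier component.
    """
    d = means.shape[1]
    X = means.copy()
    s2 = np.full(len(X), float(sigma2))
    Rs = [np.eye(d) for _ in sets]
    ts = [np.zeros(d) for _ in sets]
    for _ in range(n_iter):
        post, moved = [], []
        for j, V in enumerate(sets):
            # E-step: responsibilities of each component for each transformed point
            TV = V @ Rs[j].T + ts[j]
            d2 = ((TV[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # N_j x K
            logp = -0.5 * d2 / s2 - 0.5 * d * np.log(s2)
            logp -= logp.max(axis=1, keepdims=True)                # numerical stability
            A = np.exp(logp)
            A /= A.sum(axis=1, keepdims=True)
            # M-step (rigid part): align per-component weighted mean points to the means
            a = A.sum(axis=0)
            W = (A.T @ V) / np.maximum(a, 1e-12)[:, None]
            Rs[j], ts[j] = weighted_rigid(W, X, a / s2)
            post.append(A)
            moved.append(V @ Rs[j].T + ts[j])
        # M-step (mixture part): means and variances pooled over all sets
        mass = sum(A.sum(axis=0) for A in post)
        safe = np.maximum(mass, 1e-12)
        num = sum(A.T @ TV for A, TV in zip(post, moved))
        X = np.where(mass[:, None] > 1e-12, num / safe[:, None], X)  # keep empty components
        sq = sum((A * ((TV[:, None, :] - X[None, :, :]) ** 2).sum(-1)).sum(axis=0)
                 for A, TV in zip(post, moved))
        s2 = sq / (d * safe) + eps
    return Rs, ts, X, s2
```

A call such as `Rs, ts, X, s2 = joint_em([V1, V2, V3], initial_means)` returns one rotation and translation per set, together with the means `X`, which play the role of the registered point set, and the per-component variances `s2`. No set is singled out as the model: every set is aligned to the shared mixture, which is the symmetry the abstract emphasizes.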
