Conjugate Mixture Models for the Modeling of Visual and Auditory Perception. (Modèles de Mélanges Conjugués pour la Modélisation de la Perception Visuelle et Auditive)
[1] Naonori Ueda,et al. Deterministic annealing EM algorithm , 1998, Neural Networks.
[2] Radu Horaud,et al. Detection and localization of 3d audio-visual objects using unsupervised clustering , 2008, ICMI '08.
[3] Alfred O. Hero,et al. Kullback proximal algorithms for maximum-likelihood estimation , 2000, IEEE Trans. Inf. Theory.
[4] Khalid Choukri,et al. The CHIL audiovisual corpus for lecture and meeting analysis inside smart rooms , 2007, Lang. Resour. Evaluation.
[5] A. Pouget,et al. Multisensory spatial representations in eye-centered coordinates for reaching , 2002, Cognition.
[6] Ivan Laptev,et al. On Space-Time Interest Points , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.
[7] Peter K. Allen,et al. Integrating Vision and Touch for Object Recognition Tasks , 1988, Int. J. Robotics Res..
[8] Geoffrey J. McLachlan,et al. Robust mixture modelling using the t distribution , 2000, Stat. Comput..
[9] José A. Castellanos,et al. Mobile Robot Localization and Map Building: A Multisensor Fusion Approach , 2000 .
[10] T. Stanford,et al. Superadditivity in multisensory integration: putting the computation in context. , 2007, Neuroreport.
[11] Mark T. Wallace,et al. The influence of visual and auditory receptive field organization on multisensory integration in the superior colliculus , 2001, Experimental Brain Research.
[12] Tomás Svoboda,et al. A Convenient Multicamera Self-Calibration for Virtual Environments , 2005, Presence: Teleoperators & Virtual Environments.
[13] Mary P. Harper,et al. VACE Multimodal Meeting Corpus , 2005, MLMI.
[14] A. A. Zhigli︠a︡vskiĭ,et al. Stochastic Global Optimization , 2007 .
[15] Bernhard P. Wrobel,et al. Multiple View Geometry in Computer Vision , 2001 .
[16] Michael I. Miller,et al. REPRESENTATIONS OF KNOWLEDGE IN COMPLEX SYSTEMS , 1994 .
[17] Jean-Philippe Thiran,et al. The BANCA Database and Evaluation Protocol , 2003, AVBPA.
[18] Ivan Himawan,et al. Microphone Array Shape Calibration in Diffuse Noise Fields , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[19] A. Zhigljavsky. Stochastic Global Optimization , 2008, International Encyclopedia of Statistical Science.
[20] M. Jacobsen. Point Process Theory and Applications: Marked Point and Piecewise Deterministic Processes , 2005 .
[21] Larry S. Davis,et al. Joint Audio-Visual Tracking Using Particle Filters , 2002, EURASIP J. Adv. Signal Process..
[22] Hugh F. Durrant-Whyte,et al. Multisensor data fusion for underwater navigation , 2001, Robotics Auton. Syst..
[23] Lei Xu. Comparative Analysis on Convergence Rates of The EM Algorithm and Its Two Modifications for Gaussian Mixtures , 2004, Neural Processing Letters.
[24] Brian R Glasberg,et al. Derivation of auditory filter shapes from notched-noise data , 1990, Hearing Research.
[25] Guy J. Brown,et al. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , 2006 .
[26] Kevin P. Murphy,et al. Dynamic Bayesian Networks for Audio-Visual Speech Recognition , 2002, EURASIP J. Adv. Signal Process..
[27] A. King,et al. Multisensory Integration: Strategies for Synchronization , 2005, Current Biology.
[28] Herman Bruyninckx,et al. Kalman filters for non-linear systems: a comparison of performance , 2004 .
[29] D. C. Higgins. Human Spatial Orientation , 1967, The Yale Journal of Biology and Medicine.
[30] Manuel Yguel,et al. Efficient GPU-based Construction of Occupancy Grids Using several Laser Range-finders , 2008 .
[31] Rainer Stiefelhagen,et al. Audio-visual multi-person tracking and identification for smart environments , 2007, ACM Multimedia.
[32] Sophie M. Wuerger,et al. Low-level integration of auditory and visual motion signals requires spatial co-localisation , 2005, Experimental Brain Research.
[33] Jean-Marc Odobez,et al. Audiovisual Probabilistic Tracking of Multiple Speakers in Meetings , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[34] Trevor Darrell,et al. Learning Joint Statistical Models for Audio-Visual Fusion and Segregation , 2000, NIPS.
[35] Patrick Pérez,et al. Data fusion for visual tracking with particles , 2004, Proceedings of the IEEE.
[36] Marina Meila,et al. An Experimental Comparison of Model-Based Clustering Methods , 2004, Machine Learning.
[37] Trevor Darrell,et al. Multiple person and speaker activity tracking with a particle filter , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[38] Radu Horaud,et al. Cyclopean Geometry of Binocular Vision , 2008, Journal of the Optical Society of America. A, Optics, image science, and vision.
[39] R. Patterson,et al. Complex Sounds and Auditory Images , 1992 .
[40] Mark T. Wallace,et al. The influence of visual and auditory receptive field organization on multisensory integration in the superior colliculus , 2002, Experimental Brain Research.
[41] Gilles Celeux,et al. EM procedures using mean field-like approximations for Markov model-based image segmentation , 2003, Pattern Recognit..
[42] N. Kampen,et al. Stochastic processes in physics and chemistry , 1981 .
[43] F. Downton. Stochastic Approximation , 1969, Nature.
[44] John W. McDonough,et al. A joint particle filter for audio-visual speaker tracking , 2005, ICMI '05.
[45] Jean Ponce,et al. Computer Vision: A Modern Approach , 2002 .
[46] Vladimir Pavlovic,et al. Boosted learning in dynamic Bayesian networks for multimodal speaker detection , 2003, Proc. IEEE.
[47] Ryosuke Shibasaki,et al. Multi-modal tracking of people using laser scanners and video camera , 2008, Image Vis. Comput..
[48] Sethu Vijayakumar,et al. Structure Inference for Bayesian Multisensory Perception and Tracking , 2007, IJCAI.
[49] Sethu Vijayakumar,et al. Structure Inference for Bayesian Multisensor Scene Understanding , 2007 .
[50] Andrew Y. Ng,et al. Integrating Visual and Range Data for Robotic Object Detection , 2008, ECCV 2008.
[51] Ramani Duraiswami,et al. Automatic position calibration of multiple microphones , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[52] Jonathan G. Fiscus,et al. The NIST Meeting Room Corpus 2 Phase 1 , 2006, MLMI.
[53] Dongming Zhao,et al. Unscented Kalman filter for non-linear estimation , 2006 .
[54] Christophe Biernacki,et al. Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models , 2003, Comput. Stat. Data Anal..
[55] M. Ernst,et al. Humans integrate visual and haptic information in a statistically optimal fashion , 2002, Nature.
[56] Thomas J. Anastasio,et al. Using Bayes' Rule to Model Multisensory Enhancement in the Superior Colliculus , 2000, Neural Computation.
[57] D. Wang,et al. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , 2008, IEEE Trans. Neural Networks.
[58] Jiri Matas,et al. XM2VTSDB: The Extended M2VTS Database , 1999 .
[59] Gérard Govaert,et al. Assessing a Mixture Model for Clustering with the Integrated Completed Likelihood , 2000, IEEE Trans. Pattern Anal. Mach. Intell..
[60] Ning Ma,et al. Integrating pitch and localisation cues at a speech fragment level , 2007, INTERSPEECH.
[61] E. Coiras,et al. Rigid data association for shallow water surveys , 2007 .
[62] Trevor Darrell,et al. Speaker association with signal-level audiovisual fusion , 2004, IEEE Transactions on Multimedia.
[63] Martin Cooke,et al. Motion strategies for binaural localisation of speech sources in azimuth and distance by artificial listeners , 2011, Speech Commun..
[64] A. King,et al. The superior colliculus , 2004, Current Biology.
[65] Yoav Y. Schechner,et al. Harmony in Motion , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[66] Jinwen Ma,et al. Asymptotic Convergence Rate of the EM Algorithm for Gaussian Mixtures , 2000, Neural Computation.
[67] Mikhail Borisovich Nevelʹson,et al. Stochastic Approximation and Recursive Estimation , 1976 .
[68] D. Burr,et al. Combining visual and auditory information. , 2006, Progress in brain research.
[69] W. Ebeling. Stochastic Processes in Physics and Chemistry , 1995 .
[70] Josh H. McDermott. The cocktail party problem , 2009, Current Biology.
[71] Arthur C. Sanderson,et al. Multisensor Fusion - A Minimal Representation Framework , 1999, Series in Intelligent Control and Intelligent Automation.
[72] A. A. Zhigli︠a︡vskiĭ,et al. Theory of Global Random Search , 1991 .
[73] Vladimir Katkovnik,et al. Spatially Adaptive Estimation via Fitted Local Likelihood Techniques , 2008, IEEE Transactions on Signal Processing.
[74] H Colonius,et al. A two-stage model for visual-auditory interaction in saccadic latencies , 2001, Perception & psychophysics.
[75] Malcolm Slaney,et al. An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank , 1997 .
[76] P. Deb. Finite Mixture Models , 2008 .
[77] James R. Glass,et al. A segment-based audio-visual speech recognizer: data collection, development, and initial experiments , 2004, ICMI '04.
[78] C. Faller,et al. Source localization in complex listening situations: selection of binaural cues based on interaural coherence. , 2004, The Journal of the Acoustical Society of America.
[79] H. Akaike,et al. Information Theory and an Extension of the Maximum Likelihood Principle , 1973 .
[80] G. Schwarz. Estimating the Dimension of a Model , 1978 .
[81] Christopher G. Harris,et al. A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.
[82] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.
[83] Daniel P. W. Ellis,et al. An EM Algorithm for Localizing Multiple Sound Sources in Reverberant Environments , 2006, NIPS.
[84] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.
[85] B. Pannetier,et al. Improvement of Multiple Ground Targets Tracking with GMTI Sensor and Fusion of Identification Attributes , 2008, 2008 IEEE Aerospace Conference.
[86] J.N. Gowdy,et al. CUAVE: A new audio-visual database for multimodal human-computer interface research , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[87] Samy Bengio,et al. Modeling human interaction in meetings , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..
[88] E. Gassiat. Likelihood ratio inequalities with applications to various mixtures , 2002 .
[89] Frank Dellaert,et al. MCMC-based particle filtering for tracking a variable number of interacting targets , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[90] R. Horaud,et al. Audio-Visual Clustering for Multiple Speaker Localization , 2008 .
[91] M. Alex Meredith,et al. Neurons and behavior: the same rules of multisensory integration apply , 1988, Brain Research.
[92] Jon Barker,et al. Stream weight estimation for multistream audio-visual speech recognition in a multispeaker environment , 2008, Speech Commun..
[93] D. L. Hall,et al. Mathematical Techniques in Multisensor Data Fusion , 1992 .
[94] Jean-Marc Odobez,et al. AV16.3: An Audio-Visual Corpus for Speaker Localization and Tracking , 2004, MLMI.
[95] T. Stanford,et al. Multisensory integration: current issues from the perspective of the single neuron , 2008, Nature Reviews Neuroscience.
[96] Jian Yao,et al. Multi-Camera Multi-Person 3D Space Tracking with MCMC in Surveillance Scenarios , 2008, ECCV 2008.
[97] J. Idier,et al. Penalized Maximum Likelihood Estimator for Normal Mixtures , 2000 .
[98] Roberto Brunelli,et al. A Generative Approach to Audio-Visual Person Tracking , 2006, CLEAR.
[99] Patrick Pérez,et al. Sequential Monte Carlo Fusion of Sound and Vision for Speaker Tracking , 2001, ICCV.
[100] Yong Rui,et al. Real-time speaker tracking using particle filter sensor fusion , 2004, Proceedings of the IEEE.
[101] Martin Cooke,et al. Modelling auditory processing and organisation , 1993, Distinguished dissertations in computer science.
[102] Radu Horaud,et al. Conjugate Mixture Models for Clustering Multimodal Data , 2011, Neural Computation.
[103] S. P. Mudur,et al. Three-dimensional computer vision: a geometric viewpoint , 1993 .
[104] David C. Knill. Bayesian Models of Sensory Cue Integration , 2006 .
[105] S. Wuerger,et al. Cross-modal integration of auditory and visual motion signals , 2001, Neuroreport.
[106] Ren C. Luo,et al. Multisensor fusion and integration: approaches, applications, and future research directions , 2002 .
[107] G. Celeux,et al. An entropy criterion for assessing the number of clusters in a mixture model , 1996 .
[108] Trevor Darrell,et al. Audio-video array source separation for perceptual user interfaces , 2001, PUI '01.
[109] Sameer Singh,et al. Approaches to Multisensor Data Fusion in Target Tracking: A Survey , 2006, IEEE Transactions on Knowledge and Data Engineering.
[110] Alfred O. Hero,et al. On EM algorithms and their proximal generalizations , 2008, 1201.5912.
[111] Alexandre Pouget,et al. A computational perspective on the neural basis of multisensory spatial representations , 2002, Nature Reviews Neuroscience.
[112] Nebojsa Jojic,et al. A Graphical Model for Audiovisual Object Tracking , 2003, IEEE Trans. Pattern Anal. Mach. Intell..
[113] Michael I. Jordan,et al. An Introduction to Variational Methods for Graphical Models , 1999, Machine Learning.
[114] J. Lewald,et al. Cross-modal perceptual integration of spatially and temporally disparate auditory and visual stimuli. , 2003, Brain research. Cognitive brain research.
[115] C. Spence,et al. Crossmodal Space and Crossmodal Attention , 2004 .
[116] Michael S. Brandstein,et al. Robust Localization in Reverberant Rooms , 2001, Microphone Arrays.
[117] Martin Heckmann,et al. Noise Adaptive Stream Weighting in Audio-Visual Speech Recognition , 2002, EURASIP J. Adv. Signal Process..
[118] Roland Siegwart,et al. Multimodal detection and tracking of pedestrians in urban environments with explicit ground plane extraction , 2008 .
[119] R. E. Kalman,et al. New Results in Linear Filtering and Prediction Theory , 1961 .
[120] Zhihong Zeng,et al. Audio-Visual Affect Recognition , 2007, IEEE Transactions on Multimedia.
[121] Arnak S. Dalalyan,et al. A global camera network calibration method with Linear Programming , 2010 .
[122] G. McLachlan,et al. The EM algorithm and extensions , 1996 .
[123] Carlo Tomasi,et al. Good features to track , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
[124] Jean Ponce,et al. Audio-Visual Speaker Localization Using Graphical Models , 2006, 18th International Conference on Pattern Recognition (ICPR'06).
[125] Radu Horaud,et al. Patterns of Binocular Disparity for a Fixating Observer , 2007, BVAI.
[126] Ronald Mahler. A general theory of multitarget extended Kalman filters , 2005, SPIE Defense + Commercial Sensing.
[127] Sidney S. Simon,et al. Merging of the Senses , 2008, Front. Neurosci..
[128] S. M. Ermakow. Die Monte-Carlo-Methode und verwandte Fragen , 1975 .
[129] Michael F. Cohen,et al. Fourier Analysis of the 2D Screened Poisson Equation for Gradient Domain Problems , 2008, ECCV.
[130] Jon Barker,et al. The CAVA corpus: synchronised stereoscopic and binaural datasets with head movements , 2008, ICMI '08.
[131] H. McGurk,et al. Hearing lips and seeing voices , 1976, Nature.