Environmental sound processing and its applications
Tomoki Toda | Kazuya Takeda | Tomoki Hayashi | Koichi Miyazaki
[1] E. C. Cherry. Some Experiments on the Recognition of Speech, with One and with Two Ears , 1953 .
[2] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se , 1980 .
[3] B.D. Van Veen,et al. Beamforming: a versatile approach to spatial filtering , 1988, IEEE ASSP Magazine.
[4] Albert S. Bregman,et al. The Auditory Scene. (Book Reviews: Auditory Scene Analysis. The Perceptual Organization of Sound.) , 1990 .
[5] Judith C. Brown. Calculation of a constant Q spectral transform , 1991 .
[6] David K. Mellinger,et al. Event formation and separation in musical sound , 1992 .
[7] Barry Arons,et al. A Review of The Cocktail Party Effect , 1992 .
[8] Don H. Johnson,et al. Array Signal Processing: Concepts and Techniques , 1993 .
[9] Guy J. Brown,et al. Computational auditory scene analysis , 1994, Comput. Speech Lang..
[10] Pierre Comon,et al. Independent component analysis, A new concept? , 1994, Signal Process..
[11] Daniel Patrick Whittlesey Ellis,et al. Prediction-driven computational auditory scene analysis , 1996 .
[12] Daniel P. W. Ellis,et al. Prediction-Driven Computational Auditory Scene Analysis for Dense Sound Mixtures , 1996 .
[13] J. Cardoso. Infomax and maximum likelihood for blind source separation , 1997, IEEE Signal Processing Letters.
[14] Richard A. Brown,et al. Introduction to random signals and applied kalman filtering (3rd ed , 2012 .
[15] Piotr Indyk,et al. Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.
[16] Toshiyuki Asahi,et al. Sound retrieval with intuitive verbal expressions , 1998 .
[17] Tomohiro Nakatani,et al. Sound Ontology for Computational Auditory Scene Analysis , 1998, AAAI/IAAI.
[18] Paris Smaragdis,et al. Blind separation of convolved mixtures in the frequency domain , 1998, Neurocomputing.
[19] H. Sebastian Seung,et al. Learning the parts of objects by non-negative matrix factorization , 1999, Nature.
[20] Fausto Pellandini,et al. Automatic sound detection and recognition for noisy environment , 2000, 2000 10th European Signal Processing Conference.
[21] Erkki Oja,et al. Independent component analysis: algorithms and applications , 2000, Neural Networks.
[22] Kazuya Takeda,et al. Blind source separation combining frequency-domain ICA and beamforming , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[23] Michael F. Bunting,et al. The cocktail party phenomenon revisited: The importance of working memory capacity , 2001, Psychonomic bulletin & review.
[24] Claude E. Shannon,et al. A mathematical theory of communication , 1948, MOCO.
[25] J. Stephen Downie,et al. Music information retrieval , 2005, Annu. Rev. Inf. Sci. Technol..
[26] P. Smaragdis,et al. Non-negative matrix factorization for polyphonic music transcription , 2003, 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684).
[27] Shiro Ikeda,et al. A Method of ICA in Time-Frequency Domain , 2003 .
[28] Kiyohiro Shikano,et al. Blind Source Separation Combining Independent Component Analysis and Beamforming , 2003, EURASIP J. Adv. Signal Process..
[29] D. Hunter,et al. A Tutorial on MM Algorithms , 2004 .
[30] D. W. Scott. Outlier Detection and Clustering by Partial Mixture Modeling , 2004 .
[31] Hiroshi Sawada,et al. A robust and precise method for solving the permutation problem of frequency-domain blind source separation , 2004, IEEE Transactions on Speech and Audio Processing.
[32] Remco C. Veltkamp,et al. A Survey of Music Information Retrieval Systems , 2005, ISMIR.
[33] DeLiang Wang,et al. A Computational Auditory Scene Analysis System for Robust Speech Recognition , 2022 .
[34] Bernardo A. Huberman,et al. Usage patterns of collaborative tagging systems , 2006, J. Inf. Sci..
[35] Valentin Robu,et al. The Dynamics and Semantics of Collaborative Tagging , 2006, SAAW@ISWC.
[36] Te-Won Lee,et al. Independent Vector Analysis: An Extension of ICA to Multivariate Components , 2006, ICA.
[37] Hiroshi Sawada,et al. Blind Extraction of Dominant Target Sources Using ICA and Time-Frequency Masking , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[38] Johannes D. Krijnders,et al. CASSANDRA: audio-video sensor fusion for aggression detection , 2007, 2007 IEEE Conference on Advanced Video and Signal Based Surveillance.
[39] Andrey Temko,et al. Acoustic Event Detection: SVM-Based System and Evaluation Setup in CLEAR'07 , 2007, CLEAR.
[40] Augusto Sarti,et al. Scream and gunshot detection and localization for audio-surveillance systems , 2007, 2007 IEEE Conference on Advanced Video and Signal Based Surveillance.
[41] Manuele Bicego,et al. Audio-Visual Event Recognition in Surveillance Video Sequences , 2007, IEEE Transactions on Multimedia.
[42] Te-Won Lee,et al. Blind Source Separation Exploiting Higher-Order Frequency Dependencies , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[43] E. C. Cmm,et al. on the Recognition of Speech, with , 2008 .
[44] D. Wang,et al. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , 2006, IEEE Trans. Neural Networks.
[45] Asma Rabaoui,et al. Using One-Class SVMs and Wavelets for Audio Surveillance , 2008, IEEE Transactions on Information Forensics and Security.
[46] Marc Leman,et al. Content-Based Music Information Retrieval: Current Directions and Future Challenges , 2008, Proceedings of the IEEE.
[47] Hirokazu Kameoka,et al. Complex NMF: A new sparse representation for acoustic signals , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[48] Nancy Bertin,et al. Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis , 2009, Neural Computation.
[49] Ching-Yung Lin,et al. Healthcare audio event classification using Hidden Markov Models and Hierarchical Hidden Markov Models , 2009, 2009 IEEE International Conference on Multimedia and Expo.
[50] Hiroshi Sawada,et al. Blind sparse source separation for unknown number of sources using Gaussian mixture model fitting with Dirichlet prior , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[51] Nobutaka Ito,et al. Blind alignment of asynchronously recorded signals for distributed microphone array , 2009, 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[52] Hirokazu Kameoka,et al. Nonnegative Matrix Factorization with Markov-Chained Bases for Modeling Time-Varying Patterns in Music Spectrograms , 2010, LVA/ICA.
[53] Mert Bay,et al. The Music Information Retrieval Evaluation eXchange: Some Observations and Insights , 2010, Advances in Music Information Retrieval.
[54] Alexey Ozerov,et al. Multichannel Nonnegative Matrix Factorization in Convolutive Mixtures for Audio Source Separation , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[55] Andrzej Czyzewski,et al. Dangerous Sound Event Recognition Using Support Vector Machine Classifiers , 2010, MISSI.
[56] Annamaria Mesaros,et al. Sound Event Detection in Multisource Environments Using Source Separation , 2011 .
[57] Cédric Richard,et al. Abnormal events detection using unsupervised One-Class SVM - Application to audio surveillance and evaluation - , 2011, 2011 8th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).
[58] Nobutaka Ono,et al. Auxiliary-function-based independent vector analysis with power of vector-norm type weighting functions , 2012, Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference.
[59] Pascal Vincent,et al. Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[60] Björn W. Schuller,et al. Large-scale audio feature extraction and SVM for acoustic scene classification , 2013, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[61] Hirokazu Kameoka,et al. Multichannel Extensions of Non-Negative Matrix Factorization With Complex-Valued Data , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[62] Masataka Goto,et al. Infinite Positive Semidefinite Tensor Factorization for Source Separation of Mixture Signals , 2013, ICML.
[63] Ning Ma,et al. The PASCAL CHiME speech separation and recognition challenge , 2013, Comput. Speech Lang..
[64] Jon Barker,et al. The second ‘CHiME’ speech separation and recognition challenge: An overview of challenge systems and outcomes , 2013, 2013 IEEE Workshop on Automatic Speech Recognition and Understanding.
[65] Jordi Janer,et al. Sound Retrieval From Voice Imitation Queries In Collaborative Databases , 2014, Semantic Audio.
[66] Sungzoon Cho,et al. Variational Autoencoder based Anomaly Detection using Reconstruction Probability , 2015 .
[67] Huy Phan,et al. Random Regression Forests for Acoustic Event Detection and Classification , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[68] Karol J. Piczak. Environmental sound classification with convolutional neural networks , 2015, 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP).
[69] Jon Barker,et al. Chime-home: A dataset for sound source recognition in a domestic environment , 2015, 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[70] Gaël Richard,et al. HOG and subband power distribution image features for acoustic scene classification , 2015, 2015 23rd European Signal Processing Conference (EUSIPCO).
[71] Dan Stowell,et al. Detection and Classification of Acoustic Scenes and Events , 2015, IEEE Transactions on Multimedia.
[72] Toni Heittola,et al. Sound Event Detection for Office Live and Office Synthetic AASP Challenge , 2015, IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events.
[73] Hirokazu Kameoka,et al. Efficient multichannel nonnegative matrix factorization exploiting rank-1 spatial model , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[74] Alain Rakotomamonjy,et al. Histogram of gradients of Time-Frequency Representations for Audio scene detection , 2015, ArXiv.
[75] Suehiro Shimauchi,et al. Acoustic Scene Analysis Based on Hierarchical Generative Model of Acoustic Event Sequence , 2016, IEICE Trans. Inf. Syst..
[76] Annamaria Mesaros,et al. Metrics for Polyphonic Sound Event Detection , 2016 .
[77] Zhuo Chen,et al. Deep clustering: Discriminative embeddings for segmentation and separation , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[78] Justin Salamon,et al. The Implementation of Low-cost Urban Acoustic Monitoring Devices , 2016, ArXiv.
[79] Gaël Richard,et al. Acoustic scene classification with matrix factorization for unsupervised feature learning , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[80] Reishi Kondo,et al. Acoustic event detection based on non-negative matrix factorization with mixtures of local dictionaries and activation aggregation , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[81] Antonio Torralba,et al. SoundNet: Learning Sound Representations from Unlabeled Video , 2016, NIPS.
[82] Reishi Kondo,et al. Acoustic Event Detection Method Using Semi-Supervised Non-Negative Matrix Factorization with Mixtures of Local Dictionaries , 2016, DCASE.
[83] Hirokazu Kameoka,et al. Determined Blind Source Separation Unifying Independent Vector Analysis and Nonnegative Matrix Factorization , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[84] Heikki Huttunen,et al. Recurrent neural networks for polyphonic sound event detection in real life recordings , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[85] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[86] Ankit Shah,et al. DCASE2017 Challenge Setup: Tasks, Datasets and Baseline System , 2017, DCASE.
[87] Jon Barker,et al. An analysis of environment, microphone and data simulation mismatches in robust speech recognition , 2017, Comput. Speech Lang..
[88] Tomoki Toda,et al. Stereophonic music separation based on non-negative tensor factorization with cepstrum regularization , 2017, 2017 25th European Signal Processing Conference (EUSIPCO).
[89] Kyogu Lee,et al. Ensemble of Convolutional Neural Networks for Weakly-supervised Sound Event Detection Using Multiple Scale Input , 2017, DCASE.
[90] Aren Jansen,et al. CNN architectures for large-scale audio classification , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[91] Aren Jansen,et al. Audio Set: An ontology and human-labeled dataset for audio events , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[92] Tillman Weyde,et al. Singing Voice Separation with Deep U-Net Convolutional Networks , 2017, ISMIR.
[93] Takeshi Yamada,et al. Ego Noise Reduction for Hose-Shaped Rescue Robot Combining Independent Low-Rank Matrix Analysis and Multichannel Noise Cancellation , 2016, LVA/ICA.
[94] Andrew Zisserman,et al. Look, Listen and Learn , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[95] Jon Barker,et al. The third 'CHiME' speech separation and recognition challenge: Analysis and outcomes , 2017, Comput. Speech Lang..
[96] Wei Dai,et al. Very deep convolutional neural networks for raw waveforms , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[97] Tomoki Toda,et al. Duration-Controlled LSTM for Polyphonic Sound Event Detection , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[98] Nobutaka Ono,et al. Spatial Cepstrum as a Spatial Feature Using a Distributed Microphone Array for Acoustic Scene Analysis , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[99] Qiang Huang,et al. Attention and Localization Based on a Deep Convolutional Recurrent Model for Weakly Supervised Audio Tagging , 2017, INTERSPEECH.
[100] Mans Hulden,et al. Sound Analogies with Phoneme Embeddings , 2018 .
[101] Shinnosuke Takamichi,et al. Independent Deeply Learned Matrix Analysis for Multichannel Audio Source Separation , 2018, 2018 26th European Signal Processing Conference (EUSIPCO).
[102] Kunio Kashino,et al. Generating Sound Words from Audio Signals of Acoustic Events with Sequence-to-Sequence Model , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[103] Keisuke Imoto,et al. Introduction to acoustic event and scene analysis , 2018 .
[104] Tomoki Toda,et al. Anomalous Sound Event Detection Based on WaveNet , 2018, 2018 26th European Signal Processing Conference (EUSIPCO).
[105] Zhong-Qiu Wang,et al. Multi-Channel Deep Clustering: Discriminative Spectral and Spatial Embeddings for Speaker-Independent Speech Separation , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[106] Jon Barker,et al. The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines , 2018, INTERSPEECH.
[107] Li Li,et al. Semi-blind source separation with multichannel variational autoencoder , 2018, ArXiv.
[108] Mathieu Lagrange,et al. Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[109] Tomoki Toda,et al. Connectionist Temporal Classification-based Sound Event Encoder for Converting Sound Events into Onomatopoeic Representations , 2018, 2018 26th European Signal Processing Conference (EUSIPCO).
[110] Li Li,et al. Generalized Multichannel Variational Autoencoder for Underdetermined Source Separation , 2019, 2019 27th European Signal Processing Conference (EUSIPCO).