Sound event recognition in unstructured environments using spectrogram image processing
[1] Thomas S. Huang,et al. Real-world acoustic event detection , 2010, Pattern Recognit. Lett..
[2] Joseph L. Mundy,et al. Object Recognition in the Geometric Era: A Retrospective , 2006, Toward Category-Level Object Recognition.
[3] Herman J. M. Steeneken,et al. Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems , 1993, Speech Commun..
[4] S. M. Potirakis,et al. Natural soundscapes and identification of environmental sounds: A pattern recognition approach , 2009, 2009 16th International Conference on Digital Signal Processing.
[5] Chin-Hui Lee,et al. Improvements in connected digit recognition using higher order spectral and energy features , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.
[6] G. Kramer. Auditory Scene Analysis: The Perceptual Organization of Sound by Albert Bregman (review) , 2016 .
[7] Masataka Goto,et al. Gradient-based musical feature extraction based on scale-invariant feature transform , 2011, 2011 19th European Signal Processing Conference.
[8] DeLiang Wang,et al. Auditory Segmentation Based on Onset and Offset Analysis , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[9] Vitoantonio Bevilacqua,et al. A face recognition system based on Pseudo 2D HMM applied to neural network coefficients , 2008, Soft Comput..
[10] C. Köppl,et al. Coding of Sound Pressure Level in the Barn Owl's Auditory Nerve , 1999, The Journal of Neuroscience.
[11] David A. Ross,et al. Survey and Evaluation of Audio Fingerprinting Schemes for Mobile Query-by-Example Applications , 2011, ISMIR.
[12] Frank Kurth,et al. Detecting bird sounds in a complex acoustic environment and application to bioacoustic monitoring , 2010, Pattern Recognit. Lett..
[13] David G. Lowe,et al. Distinctive Image Features from Scale-Invariant Keypoints , 2004 .
[14] N N Shankar,et al. Parts based representation for pedestrian using NMF with robustness to partial occlusion , 2010, 2010 International Conference on Signal Processing and Communications (SPCOM).
[15] Richard O. Duda,et al. Use of the Hough transformation to detect lines and curves in pictures , 1972, CACM.
[16] Bhiksha Raj,et al. Microphone Array Processing for Distant Speech Recognition: From Close-Talking Microphones to Far-Field Sensors , 2012, IEEE Signal Processing Magazine.
[17] Jonathan Z. Simon,et al. Robust Spectrotemporal Reverse Correlation for the Auditory System: Optimizing Stimulus Design , 2000, Journal of Computational Neuroscience.
[18] DeLiang Wang,et al. A computational auditory scene analysis system for speech segregation and robust speech recognition , 2010, Comput. Speech Lang..
[19] Emmanuel Deruty,et al. Sound Indexing Using Morphological Description , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[20] Y.K. Muthusamy,et al. Reviewing automatic language identification , 1994, IEEE Signal Processing Magazine.
[22] T. Andringa,et al. Sound event recognition through expectancy-based evaluation of signal-driven hypotheses , 2010, Pattern Recognit. Lett..
[23] Combining Speech Fragment Decoding and Adaptive Noise Floor Modeling , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[24] DeLiang Wang,et al. On Ideal Binary Mask As the Computational Goal of Auditory Scene Analysis , 2005, Speech Separation by Humans and Machines.
[25] Mingjing Li,et al. Color texture moments for content-based image retrieval , 2002, Proceedings. International Conference on Image Processing.
[26] C.-C. Jay Kuo,et al. Where am I? Scene Recognition for Mobile Robots using Audio Features , 2006, 2006 IEEE International Conference on Multimedia and Expo.
[27] Hynek Hermansky,et al. TRAPS - classifiers of temporal patterns , 1998, ICSLP.
[28] C.-C. Jay Kuo,et al. Content/context-adaptive feature selection for environmental sound recognition , 2012, Proceedings of The 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference.
[29] Peng Li,et al. Monaural speech separation based on MAXVQ and CASA for robust speech recognition , 2010, Comput. Speech Lang..
[30] Stéphane Mallat,et al. Matching pursuits with time-frequency dictionaries , 1993, IEEE Trans. Signal Process..
[31] DeLiang Wang,et al. An auditory-based feature for robust speech recognition , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[32] Douglas D. O'Shaughnessy. Speech Communications: Human and Machine , 2012 .
[33] Luo Juan,et al. A comparison of SIFT, PCA-SIFT and SURF , 2009 .
[34] Janto Skowronek,et al. Automatic surveillance of the acoustic activity in our living environment , 2005, 2005 IEEE International Conference on Multimedia and Expo.
[35] Thomas Sikora,et al. How Efficient is MPEG-7 for General Sound Recognition? , 2004 .
[36] Cordelia Schmid,et al. Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).
[37] Jindong Liu,et al. Mobile robot broadband sound localisation using a biologically inspired spiking neural network , 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[38] Samantha J Barry,et al. The automatic recognition and counting of cough , 2006, Cough.
[39] Yihong Gong,et al. Linear spatial pyramid matching using sparse coding for image classification , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[40] M. Casey,et al. MPEG-7 sound-recognition tools , 2001, IEEE Trans. Circuits Syst. Video Technol..
[41] Gabriela Csurka,et al. Visual categorization with bags of keypoints , 2002, ECCV 2004.
[42] Zheru Chi,et al. Improvement of Image Classification Using Wavelet Coefficients with Structured-Based Neural Network , 2008, Int. J. Neural Syst..
[43] Ian H. Witten,et al. The WEKA data mining software , 2009 .
[44] C. H. Chen. Pattern recognition applications in underwater acoustics , 1984 .
[45] George Tzanetakis,et al. Multifeature audio segmentation for browsing and annotation , 1999, Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. WASPAA'99 (Cat. No.99TH8452).
[46] DeLiang Wang,et al. A model for multitalker speech perception. , 2008, The Journal of the Acoustical Society of America.
[47] Kamil Behun. Image features in music style recognition , 2012 .
[48] Haizhou Li,et al. Jump Function Kolmogorov for overlapping audio event classification , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[49] Patrik O. Hoyer,et al. Non-negative Matrix Factorization with Sparseness Constraints , 2004, J. Mach. Learn. Res..
[50] Yoshitaka Nakajima,et al. Auditory Scene Analysis: The Perceptual Organization of Sound Albert S. Bregman , 1992 .
[51] Antonio Torralba,et al. Sharing features: efficient boosting procedures for multiclass object detection , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..
[52] Samy Bengio,et al. A Discriminative Approach for the Retrieval of Images from Text Queries , 2006, ECML.
[53] Barry Arons,et al. A Review of The Cocktail Party Effect , 1992 .
[54] Sam T. Roweis,et al. Factorial models and refiltering for speech separation and denoising , 2003, INTERSPEECH.
[55] Seiichi Uchida,et al. A Survey of Elastic Matching Techniques for Handwritten Character Recognition , 2005, IEICE Trans. Inf. Syst..
[56] Phil D. Green,et al. Robust automatic speech recognition with missing and unreliable acoustic data , 2001, Speech Commun..
[57] Vesa T. Peltonen,et al. Computational auditory scene recognition , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[58] Derek Hoiem,et al. SOLAR: sound object localization and retrieval in complex audio environments , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[59] Andrey Temko,et al. Fuzzy integral based information fusion for classification of highly confusable non-speech sounds , 2008, Pattern Recognit..
[60] H. Sompolinsky,et al. The tempotron: a neuron that learns spike timing–based decisions , 2006, Nature Neuroscience.
[61] Ben P. Milner,et al. Acoustic environment classification , 2006, TSLP.
[62] Jeffrey R Binder,et al. Human brain regions involved in recognizing environmental sounds. , 2004, Cerebral cortex.
[63] Ching-Yung Lin,et al. Healthcare audio event classification using Hidden Markov Models and Hierarchical Hidden Markov Models , 2009, 2009 IEEE International Conference on Multimedia and Expo.
[64] Ralf Schlüter,et al. Non-stationary feature extraction for automatic speech recognition , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[65] Bhiksha Raj,et al. Spectrographic seam patterns for discriminative word spotting , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[66] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[67] Michael Kleinschmidt,et al. Localized spectro-temporal features for automatic speech recognition , 2003, INTERSPEECH.
[68] Michael S. Lewicki,et al. Efficient coding of natural sounds , 2002, Nature Neuroscience.
[69] Moncef Gabbouj,et al. MUVIS: A Content-Based Indexing and Retrieval System for Image and Video Databases , 1999 .
[70] Andrey Temko,et al. Acoustic event detection in meeting-room environments , 2009, Pattern Recognit. Lett..
[71] Daniel P. W. Ellis,et al. Fingerprinting to Identify Repeated Sound Events in Long-Duration Personal Audio Recordings , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[72] Douglas A. Reynolds,et al. An overview of automatic speaker recognition technology , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[73] Martin Cooke,et al. A glimpsing model of speech perception in noise. , 2006, The Journal of the Acoustical Society of America.
[74] S A Shamma,et al. Spectro-temporal response field characterization with dynamic ripples in ferret primary auditory cortex. , 2001, Journal of neurophysiology.
[75] Michael I. Jordan,et al. Factorial Hidden Markov Models , 1995, Machine Learning.
[76] Erkki Oja,et al. Independent component analysis: algorithms and applications , 2000, Neural Networks.
[77] Danijel Skocaj,et al. Robust recognition and pose determination of 3-D objects using range images in eigenspace approach , 2001, Proceedings Third International Conference on 3-D Digital Imaging and Modeling.
[78] Benjamin Peter Milner,et al. Speech recognition in adverse environments , 1994 .
[79] Satoshi Nakamura,et al. Acoustical Sound Database in Real Environments for Sound Scene Understanding and Hands-Free Speech Recognition , 2000, LREC.
[80] Hossein Najaf-Zadeh,et al. Auditory-inspired sparse representation of audio signals , 2011, Speech Commun..
[81] Hideyuki Tamura,et al. Textural Features Corresponding to Visual Perception , 1978, IEEE Transactions on Systems, Man, and Cybernetics.
[82] Richard M. Stern,et al. A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition , 2004, Speech Commun..
[83] Monson H. Hayes,et al. Hidden Markov models for face recognition , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[84] Shrikanth Narayanan,et al. Environmental Sound Recognition With Time–Frequency Audio Features , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[85] Pietro Perona,et al. A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[86] Richard F. Lyon,et al. On the importance of time—a temporal representation of sound , 1993 .
[87] Subhransu Maji,et al. Object detection using a max-margin Hough transform , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[88] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.
[89] Ghulam Muhammad,et al. Environment Recognition from Audio Using MPEG-7 Features , 2009, 2009 Fourth International Conference on Embedded and Multimedia Computing.
[90] Daniel Patrick Whittlesey Ellis,et al. Prediction-driven computational auditory scene analysis , 1996 .
[91] Noboru Ohnishi,et al. Building ears for robots: Sound localization and separation , 1997, Artificial Life and Robotics.
[92] R.D. Dony,et al. Audio Environment Classification for Hearing Aids using Artificial Neural Networks with Windowed Input , 2007, 2007 IEEE Symposium on Computational Intelligence in Image and Signal Processing.
[93] Andrey Temko,et al. Classification of meeting-room acoustic events with support vector machines and variable-feature-set clustering , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[94] Chloé Clavel,et al. Events Detection for an Audio-Based Surveillance System , 2005, 2005 IEEE International Conference on Multimedia and Expo.
[95] Sylvain Marchand,et al. THE HOUGH TRANSFORM FOR BINAURAL SOURCE LOCALIZATION , 2009 .
[96] Jean Paul Haton,et al. On noise masking for automatic missing data speech recognition: A survey and discussion , 2007, Comput. Speech Lang..
[97] Thomas C. Walters. Auditory-based processing of communication sounds , 2011 .
[98] Gerhard Rigoll,et al. Recognition of JPEG compressed face images based on statistical methods , 2000, Image Vis. Comput..
[99] Lie Lu,et al. Content analysis for audio classification and segmentation , 2002, IEEE Trans. Speech Audio Process..
[100] Björn W. Schuller,et al. Semi-supervised learning helps in sound event classification , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[101] Th. Beth,et al. ANALYSIS OF DRILL SOUND IN SPINE SURGERY , 2004 .
[102] Joseph Picone,et al. Signal modeling techniques in speech recognition , 1993, Proc. IEEE.
[103] Chng Eng Siong,et al. Image Feature Representation of the Subband Power Distribution for Robust Sound Event Classification , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[104] Andrew Zisserman,et al. A Boundary-Fragment-Model for Object Detection , 2006, ECCV.
[105] David A. McAllester,et al. A discriminatively trained, multiscale, deformable part model , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[106] P. Roth,et al. SURVEY OF APPEARANCE-BASED METHODS FOR OBJECT RECOGNITION , 2008 .
[107] Wai C. Chu,et al. Speech Coding Algorithms , 2003 .
[108] Tony Ezzat,et al. Discriminative word-spotting using ordered spectro-temporal patch features , 2008, SAPA@INTERSPEECH.
[109] Koen E. A. van de Sande,et al. Evaluating Color Descriptors for Object and Scene Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[110] Christopher J. C. Burges,et al. A Tutorial on Support Vector Machines for Pattern Recognition , 1998, Data Mining and Knowledge Discovery.
[111] Isabel Trancoso,et al. Detecting audio events for semantic video search , 2009, INTERSPEECH.
[112] Christopher Heil,et al. Continuous and Discrete Wavelet Transforms , 1989, SIAM Rev..
[113] Cordelia Schmid,et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[114] Pierre Divenyi. Speech Separation by Humans and Machines , 2004 .
[115] Hrishikesh Deshpande,et al. CLASSIFICATION OF MUSIC SIGNALS IN THE VISUAL DOMAIN , 2001 .
[116] Tao Zhang,et al. Evaluation of sound classification algorithms for hearing aid applications , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[117] Haizhou Li,et al. Image Representation of the Subband Power Distribution for Robust Sound Classification , 2011, INTERSPEECH.
[118] R. K. Reddy,et al. Categorization of environmental sounds , 2009, Biological Cybernetics.
[119] Shu-Yuan Chen,et al. Image classification using color, texture and regions , 2003, Image Vis. Comput..
[120] Yali Amit,et al. Robust acoustic object detection. , 2005, The Journal of the Acoustical Society of America.
[121] Glenn Fung,et al. Proximal support vector machine classifiers , 2001, KDD '01.
[122] M. Turk,et al. Eigenfaces for Recognition , 1991, Journal of Cognitive Neuroscience.
[123] Nikos Fakotakis,et al. On acoustic surveillance of hazardous situations , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[124] Tony Ezzat,et al. Localized spectro-temporal cepstral analysis of speech , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[125] L.-H. Chen,et al. Colour image retrieval based on primitives of colour moments , 2002 .
[126] Roger K. Moore,et al. Hidden Markov model decomposition of speech and noise , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[127] Lars Kai Hansen,et al. Temporal Feature Integration for Music Genre Classification , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[128] Richard E. Turner. Statistical models for natural sounds , 2010 .
[129] Mathieu Lagrange,et al. Polyphonic Instrument Recognition Using Spectral Clustering , 2007, ISMIR.
[130] Ning Ma,et al. Speech fragment decoding techniques for simultaneous speaker identification and speech recognition , 2010, Comput. Speech Lang..
[131] B. Schiele,et al. Combined Object Categorization and Segmentation With an Implicit Shape Model , 2004 .
[132] S. Govindarajulu,et al. A Comparison of SIFT, PCA-SIFT and SURF , 2012 .
[133] E. Coyle,et al. Onset based audio segmentation for the Irish tin whistle , 2004, Proceedings 7th International Conference on Signal Processing, 2004. Proceedings. ICSP '04. 2004..
[134] Shumeet Baluja,et al. Audio Fingerprinting: Combining Computer Vision & Data Stream Processing , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[135] Michael J. Black,et al. EigenTracking: Robust Matching and Tracking of Articulated Objects Using a View-Based Representation , 1996, International Journal of Computer Vision.
[136] Jinhai Cai,et al. Sensor Network for the Monitoring of Ecosystem: Bird Species Recognition , 2007, 2007 3rd International Conference on Intelligent Sensors, Sensor Networks and Information.
[137] Richard M. Stern,et al. Robust Speech Recognition: The case for restoring missing features , 2001 .
[138] C. H. Chen. Recognition of underwater transient patterns , 1985, Pattern Recognit..
[139] Francesc Alías,et al. Gammatone Cepstral Coefficients: Biologically Inspired Features for Non-Speech Audio Classification , 2012, IEEE Transactions on Multimedia.
[140] Guillaume Lemaitre,et al. Real-Time Detection of Overlapping Sound Events with Non-Negative Matrix Factorization , 2013 .
[141] Cordelia Schmid,et al. Scale & Affine Invariant Interest Point Detectors , 2004, International Journal of Computer Vision.
[142] Mark J. F. Gales,et al. Model-based techniques for noise robust speech recognition , 1995 .
[143] I. Paraskevas,et al. Audio classification using acoustic images for retrieval from multimedia databases , 2003, Proceedings EC-VIP-MC 2003. 4th EURASIP Conference focused on Video/Image Processing and Multimedia Communications (IEEE Cat. No.03EX667).
[144] Kenneth Thomas Schutte,et al. Parts-based models and local features for automatic speech recognition , 2009 .
[145] Horst Bischof,et al. Dealing with occlusions in the eigenspace approach , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[146] C.-C. Jay Kuo,et al. Environmental sound recognition using MP-based features , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[147] Nobuyuki Miyake,et al. Noise Detection and Classification in Speech Signals with Boosting , 2007, 2007 IEEE/SP 14th Workshop on Statistical Signal Processing.
[148] Bill Triggs,et al. Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[149] Martin Heckmann,et al. A hierarchical framework for spectro-temporal feature extraction , 2011, Speech Commun..
[150] Haibo Li,et al. Simple 1D Discrete Hidden Markov Models for Face Recognition , 2003, VLBV.
[151] Douglas A. Reynolds,et al. Approaches and applications of audio diarization , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[152] Nima Mesgarani,et al. Discrimination of speech from nonspeech based on multiscale spectro-temporal Modulations , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[153] David A. Cieslak,et al. Hellinger distance decision trees are robust and skew-insensitive , 2011, Data Mining and Knowledge Discovery.
[154] Nikos Fakotakis,et al. Automatic Recognition of an Unknown and Time-Varying Number of Simultaneous Environmental Sound Sources , 2011 .
[155] Jont B. Allen,et al. How do humans process and recognize speech? , 1993, IEEE Trans. Speech Audio Process..
[156] Richard F. Lyon,et al. Machine Hearing: An Emerging Field , 2010 .
[157] Stefano Soatto,et al. Dynamic Textures , 2003, International Journal of Computer Vision.
[158] Stuart J. Russell,et al. Dynamic bayesian networks: representation, inference and learning , 2002 .
[159] Tomohiro Nakatani,et al. Making Machines Understand Us in Reverberant Rooms: Robustness Against Reverberation for Automatic Speech Recognition , 2012, IEEE Signal Process. Mag..
[160] Douglas D. O'Shaughnessy,et al. Invited paper: Automatic speech recognition: History, methods and challenges , 2008, Pattern Recognit..
[161] Renate Sitte,et al. Comparison of techniques for environmental sound recognition , 2003, Pattern Recognit. Lett..
[162] Andrey Temko,et al. ACOUSTIC EVENT DETECTION AND CLASSIFICATION IN SMART-ROOM ENVIRONMENTS: EVALUATION OF CHIL PROJECT SYSTEMS , 2006 .
[163] Douglas Keislar,et al. Content-Based Classification, Search, and Retrieval of Audio , 1996, IEEE Multim..
[164] Miroslaw Bober,et al. MPEG-7 visual shape descriptors , 2001, IEEE Trans. Circuits Syst. Video Technol..
[165] Alain Dufaux. Detection and Recognition of Impulsive Sound Signals , 2001 .
[166] David G. Lowe,et al. Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.
[167] John Midgley,et al. Probabilistic eigenspace object recognition in the presence of occlusion , 2001 .
[168] Haizhou Li,et al. A first speech recognition system for Mandarin-English code-switch conversational speech , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[169] Derek Hoiem,et al. Computer vision for music identification , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[170] A.S.A. Mohamed,et al. Recognition of heart sounds and murmurs for cardiac diagnosis , 1988, [1988 Proceedings] 9th International Conference on Pattern Recognition.
[171] Cordelia Schmid,et al. Accurate Object Detection with Deformable Shape Models Learnt from Images , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[172] Cordelia Schmid,et al. Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study , 2006, CVPR Workshops.
[173] Tetsuya Takiguchi,et al. Gradient-based acoustic features for speech recognition , 2009, 2009 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS).
[174] Mandy Eberhart,et al. Speech Communications Human And Machine , 2016 .
[175] Peter E. Hart,et al. Experiments in Scene Analysis , 1970 .
[176] Augusto Sarti,et al. Scream and gunshot detection in noisy environments , 2007, 2007 15th European Signal Processing Conference.
[177] John J. Godfrey,et al. SWITCHBOARD: telephone speech corpus for research and development , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[178] Dana H. Ballard,et al. Generalizing the Hough transform to detect arbitrary shapes , 1981, Pattern Recognit..
[179] Takeo Kanade,et al. Object Detection Using the Statistics of Parts , 2004, International Journal of Computer Vision.
[180] N. A. Thacker,et al. Tutorial: Algorithms For 2-Dimensional Object Recognition. , 1996 .
[181] Gaël Richard,et al. Temporal Integration for Audio Classification With Application to Musical Instrument Classification , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[182] R. Christopher deCharms,et al. Primary cortical representation of sounds by the coordination of action-potential timing , 1996, Nature.
[183] Miguel Á. Carreira-Perpiñán,et al. Mode-Finding for Mixtures of Gaussian Distributions , 2000, IEEE Trans. Pattern Anal. Mach. Intell..
[184] Renate Sitte,et al. Analysis of Speech Recognition Techniques for use in a Non-Speech Sound Recognition System , 2002 .
[185] Tetsuya Ogata,et al. Effects of modelling within- and between-frame temporal variations in power spectra on non-verbal sound recognition , 2010, INTERSPEECH.
[186] Brian Gygi,et al. Similarity and categorization of environmental sounds , 2007, Perception & psychophysics.
[187] Martial Michel,et al. The CLEAR 2007 Evaluation , 2007, CLEAR.
[188] F. Beritelli,et al. A pattern recognition system for environmental sound classification based on MFCCs and neural networks , 2008, 2008 2nd International Conference on Signal Processing and Communication Systems.
[189] Avery Wang,et al. An Industrial Strength Audio Search Algorithm , 2003, ISMIR.
[190] Haizhou Li,et al. Normalization of the Speech Modulation Spectra for Robust Speech Recognition , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[191] Andreas Spanias,et al. Segmentation, Indexing, and Retrieval for Environmental and Natural Sounds , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[192] G. Mangun,et al. Tonotopy in human auditory cortex examined with functional magnetic resonance imaging , 1997, Human brain mapping.
[193] Massimo Minervini,et al. Nonnegative Matrix Factorizations Performing Object Detection and Localization , 2012, Appl. Comput. Intell. Soft Comput..
[194] Enzo Mumolo,et al. Algorithms for acoustic localization based on microphone array in service robotics , 2003, Robotics Auton. Syst..
[195] S. Boll,et al. Suppression of acoustic noise in speech using spectral subtraction , 1979 .
[196] Panu Somervuo,et al. Parametric Representations of Bird Sounds for Automatic Species Recognition , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[197] Chng Eng Siong,et al. Overlapping sound event recognition using local spectrogram features and the generalised hough transform , 2013, Pattern Recognit. Lett..
[198] Taras Butko,et al. Acoustic Event Detection Based on Feature-Level Fusion of Audio and Video Modalities , 2011, EURASIP J. Adv. Signal Process..
[199] Haizhou Li,et al. Temporal coding of local spectrogram features for robust sound recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[200] Leszek Cieplinski. MPEG-7 Color Descriptors and Their Applications , 2001, CAIP.
[201] Rainer Stiefelhagen,et al. Computers in the Human Interaction Loop , 2009, Human-Computer Interaction Series.
[202] Chidchanok Lursinsap,et al. Impulsive Environment Sound Detection by Neural Classification of Spectrogram and Mel-Frequency Coefficient Images , 2010 .
[203] Michael J. Swain,et al. Color indexing , 1991, International Journal of Computer Vision.
[204] Alfred Mertins,et al. Analysis and design of gammatone signal models. , 2009, The Journal of the Acoustical Society of America.
[205] V. Kshirsagar,et al. Face recognition using Eigenfaces , 2011, 2011 3rd International Conference on Computer Research and Development.
[206] Tuomas Virtanen,et al. Acoustic event detection in real life recordings , 2010, 2010 18th European Signal Processing Conference.
[207] Malcolm Slaney,et al. An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank , 1997 .
[208] Powen Ru,et al. Multiresolution spectrotemporal analysis of complex sounds. , 2005, The Journal of the Acoustical Society of America.
[209] H. Sompolinsky,et al. Time-Warp–Invariant Neuronal Processing , 2009, PLoS biology.
[210] Dan Roth,et al. Learning to detect objects in images via a sparse, part-based representation , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[211] Haizhou Li,et al. Sound Event Recognition With Probabilistic Distance SVMs , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[212] Seungjin Choi,et al. Nonnegative features of spectro-temporal sounds for classification , 2005, Pattern Recognit. Lett..
[213] David Pearce,et al. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions , 2000, INTERSPEECH.
[214] M. Kleinschmidt. Methods for capturing spectro-temporal modulations in automatic speech recognition , 2001 .
[215] Jitendra Malik,et al. Blobworld: Image Segmentation Using Expectation-Maximization and Its Application to Image Querying , 2002, IEEE Trans. Pattern Anal. Mach. Intell..
[216] J. Pickles. An Introduction to the Physiology of Hearing , 1982 .
[217] Sridhar Krishnan,et al. Time–Frequency Matrix Feature Extraction and Classification of Environmental Audio Signals , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[218] Trevor Darrell,et al. The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.
[219] Andrey Temko,et al. Acoustic Event Detection and Classification , 2007, Computers in the Human Interaction Loop.
[220] Haizhou Li,et al. Spectrogram Image Feature for Sound Event Classification in Mismatched Conditions , 2011, IEEE Signal Processing Letters.
[221] J J Hopfield,et al. What is a moment? Transient synchrony as a collective mechanism for spatiotemporal integration. , 2001, Proceedings of the National Academy of Sciences of the United States of America.
[222] Juan Manuel Górriz,et al. Voice Activity Detection. Fundamentals and Speech Recognition System Robustness , 2007 .
[223] R.M. Stern,et al. Missing-feature approaches in speech recognition , 2005, IEEE Signal Processing Magazine.
[224] Jean-Jacques E. Slotine,et al. Audio classification from time-frequency texture , 2008, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[225] Leonidas J. Guibas,et al. The Earth Mover's Distance as a Metric for Image Retrieval , 2000, International Journal of Computer Vision.
[226] Tomi Kinnunen,et al. Audio context recognition in variable mobile environments from short segments using speaker and language recognizers , 2012, Odyssey.
[227] Andrey Temko,et al. CLEAR Evaluation of Acoustic Event Detection and Classification Systems , 2006, CLEAR.
[228] M. Basseville. Distance measures for signal processing and pattern recognition , 1989 .
[229] Gy Kovács,et al. Localized spectro-temporal features for noise-robust speech recognition , 2010, 2010 International Joint Conference on Computational Cybernetics and Technical Informatics.
[230] Andrew Blake,et al. Multiscale Categorical Object Recognition Using Contour Fragments , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[231] Daniel P. W. Ellis,et al. Decoding speech in the presence of other sources , 2005, Speech Commun..
[232] David Gerhard,et al. Audio Signal Classification: History and Current Techniques , 2003 .
[233] James R. Glass,et al. Speech recognition with localized time-frequency pattern detectors , 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).
[234] Haizhou Li,et al. Selective gammatone filterbank feature for robust sound event recognition , 2010, INTERSPEECH.
[235] Klaus Obermayer,et al. Classification Schemes for Step Sounds Based on Gammatone-Filters , 2007, NIPS 2007.
[236] Subhransu Maji,et al. Classification using intersection kernel support vector machines is efficient , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[237] Annamaria Mesaros,et al. Sound Event Detection in Multisource Environments Using Source Separation , 2011 .
[238] Michael A. Cowling,et al. Non-Speech Environmental Sound Classification System for Autonomous Surveillance , 2004 .
[239] David V. Anderson,et al. Audio classification and scene recognition and for hearing aids , 2005, 2005 IEEE International Symposium on Circuits and Systems.
[240] Douglas OʼShaughnessy. Formant Estimation and Tracking , 2008 .
[241] Bernt Schiele,et al. Robust Object Detection with Interleaved Categorization and Segmentation , 2008, International Journal of Computer Vision.
[242] Steve J. Young,et al. HMM-based architecture for face identification , 1994, Image Vis. Comput..
[243] Vesa T. Peltonen,et al. Audio-based context recognition , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[244] Shao-Hu Peng,et al. A visual shape descriptor using sectors and shape context of contour lines , 2010, Inf. Sci..
[245] Luc Van Gool,et al. Fast PRISM: Branch and Bound Hough Transform for Object Class Detection , 2011, International Journal of Computer Vision.
[246] R. Meddis. Simulation of mechanical to neural transduction in the auditory receptor. , 1986, The Journal of the Acoustical Society of America.
[247] Jean-Sebastien Legare,et al. Face Recognition : Robustness of the ‘ Eigenface ’ Approach , 2005 .
[248] Paul A. Viola,et al. Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.
[249] Alfred Mertins,et al. Automatic speech recognition and speech variability: A review , 2007, Speech Commun..
[250] Taras Butko,et al. Feature selection for multimodal: acoustic event detection , 2011 .
[251] François Pachet,et al. Exploring Billions of Audio Features , 2007, 2007 International Workshop on Content-Based Multimedia Indexing.
[252] Eli Shechtman,et al. In defense of Nearest-Neighbor based image classification , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[253] Brian R Glasberg,et al. Derivation of auditory filter shapes from notched-noise data , 1990, Hearing Research.
[254] Wilhelm Burger,et al. Digital Image Processing - An Algorithmic Introduction using Java , 2008, Texts in Computer Science.
[255] Longbiao Wang,et al. Robust distant speaker recognition based on position-dependent CMN by combining speaker-specific GMM with speaker-adapted HMM , 2007, Speech Commun..
[256] Ben Pinkowski. A Template-Based Approach for Recognition of Intermittent Sounds , 1989, Great Lakes Computer Science Conference.
[257] Björn W. Schuller,et al. Audio recognition in the wild: Static and dynamic classification on a real-world database of animal vocalizations , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).