Speech and Non-speech Audio Files Discrimination Extracting Textural and Acoustic Features

For humans, discriminating between speech and non-speech audio files is an easy task that requires little effort; for a machine, however, the same task demands a degree of intelligence, and it remains an active area of research. The prime goal of this work is to introduce a novel scheme for discriminating speech from non-speech audio data. The proposed methodology identifies these two categories by extracting both textural and acoustic features. Since not all of the extracted features are equally important for discriminating the audio files, several popular feature selection methods are applied to determine the most informative ones, and classification algorithms are then applied to the reduced audio dataset. The experimental results demonstrate the relevance of the extracted features for this application.
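The pipeline the abstract describes (extract feature vectors, rank features by discriminative power, then classify on the reduced set) can be sketched as follows. This is a minimal illustration only: the feature names, the Fisher-score ranking, the nearest-centroid classifier, and the synthetic data are all assumptions, not the paper's actual features, selection methods, or classifiers.

```python
# Hypothetical sketch of a feature-extraction -> feature-selection ->
# classification pipeline for speech/non-speech discrimination.
# All feature names and data below are synthetic placeholders.
import random, math

random.seed(0)
FEATURES = ["zcr", "spectral_centroid", "contrast", "energy"]  # assumed names

def synth(label, n=30):
    # Synthetic samples: the first two features separate the classes,
    # the last two are pure noise (stand-ins for unhelpful features).
    out = []
    for _ in range(n):
        base = 1.0 if label == "speech" else 0.0
        out.append(([base + random.gauss(0, 0.2),
                     2 * base + random.gauss(0, 0.3),
                     random.gauss(0, 1),
                     random.gauss(0, 1)], label))
    return out

data = synth("speech") + synth("nonspeech")

def fisher_score(idx):
    # Between-class separation over within-class spread for one feature:
    # a simple stand-in for "how important is this feature".
    a = [x[idx] for x, y in data if y == "speech"]
    b = [x[idx] for x, y in data if y == "nonspeech"]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((v - ma) ** 2 for v in a) / len(a)
    vb = sum((v - mb) ** 2 for v in b) / len(b)
    return (ma - mb) ** 2 / (va + vb + 1e-9)

# Rank features and keep only the top two (the "reduced dataset").
ranked = sorted(range(len(FEATURES)), key=fisher_score, reverse=True)
top = ranked[:2]

# Nearest-centroid classifier trained on the reduced feature set.
def centroid(label):
    rows = [[x[i] for i in top] for x, y in data if y == label]
    return [sum(col) / len(col) for col in zip(*rows)]

c_speech, c_nonspeech = centroid("speech"), centroid("nonspeech")

def classify(x):
    v = [x[i] for i in top]
    return ("speech" if math.dist(v, c_speech) < math.dist(v, c_nonspeech)
            else "nonspeech")

acc = sum(classify(x) == y for x, y in data) / len(data)
```

On this toy data the ranking recovers the two discriminative features and discards the noise features, which mirrors the abstract's point that selection plus classification on a reduced dataset suffices.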
