Speaker identification through artificial intelligence techniques: A comprehensive review and research challenges

[1]  Hossein Marvi,et al.  Text-independent speaker identification based on selection of the most similar feature vectors , 2016, International Journal of Speech Technology.

[2]  T. Jayasree,et al.  Cascaded Feedforward Neural Networks for speaker identification using Perceptual Wavelet based Cepstral Coefficients , 2019, J. Intell. Fuzzy Syst..

[3]  Vijay M. Sardar,et al.  Timbre features for speaker identification of whispering speech: selection of optimal audio descriptors , 2019, International Journal of Computers and Applications.

[4]  Madasu Hanmandlu,et al.  Higher order information set based features for text-independent speaker identification , 2018, Int. J. Speech Technol..

[5]  John H. L. Hansen,et al.  Text-Independent Speaker Verification Based on Triplet Convolutional Neural Network Embeddings , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[6]  Theodoros Giannakopoulos.  pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis , 2015, PloS one.

[7]  Khaled Daqrouq,et al.  Wavelet entropy and neural network for text-independent speaker identification , 2011, Eng. Appl. Artif. Intell..

[8]  Eliathamby Ambikairajah,et al.  A segment selection technique for speaker verification , 2010, Speech Commun..

[9]  Javier Hernando,et al.  Restricted Boltzmann machines for vector representation of speech in speaker recognition , 2018, Comput. Speech Lang..

[10]  Ruili Wang,et al.  Speaker identification features extraction methods: A systematic review , 2017, Expert Syst. Appl..

[11]  Ying Wah Teh,et al.  Multi-sensor fusion based on multiple classifier systems for human activity identification , 2019, Human-centric Computing and Information Sciences.

[12]  Mohammad Farukh Hashmi,et al.  Virtual home assistant for voice based controlling and scheduling with short speech speaker identification , 2018, Multimedia Tools and Applications.

[13]  Bin Ma,et al.  Text-dependent speaker verification: Classifiers, databases and RSR2015 , 2014, Speech Commun..

[14]  C. Brodley,et al.  Decision tree classification of land cover from remotely sensed data , 1997 .

[15]  Fathi E. Abd El-Samie,et al.  A Novel Speech Enhancement Method Using Fourier Series Decomposition and Spectral Subtraction for Robust Speaker Identification , 2019, Wirel. Pers. Commun..

[16]  Geoffrey Stewart Morrison,et al.  INTERPOL survey of the use of speaker identification by law enforcement agencies. , 2016, Forensic science international.

[17]  Vytautas Rudžionis,et al.  Building LSTM neural network based speaker identification system , 2018 .

[18]  Jian-Da Wu,et al.  Speaker identification using discrete wavelet packet transform technique with irregular decomposition , 2009, Expert Syst. Appl..

[19]  M. K. Gill,et al.  Vector Quantization based Speaker Identification , 2010 .

[20]  Ulrich Heute,et al.  Text-independent speaker identification system based on the histogram of DCT-cepstrum coefficients , 2012, Int. J. Knowl. Based Intell. Eng. Syst..

[21]  A. P. Dawid,et al.  Generative or Discriminative? Getting the Best of Both Worlds , 2007 .

[22]  Barry Arons,et al.  A Conversational Telephone Messaging System , 1984, IEEE Transactions on Consumer Electronics.

[23]  Osama S. Faragallah,et al.  Robust noise MKMFCC–SVM automatic speaker identification , 2018, International Journal of Speech Technology.

[24]  Diego Cabrera,et al.  Multimodal deep support vector classification with homologous features and its application to gearbox fault diagnosis , 2015, Neurocomputing.

[25]  B. Yegnanarayana,et al.  Detection of glottal closure instant and glottal open region from speech signals using spectral flatness measure , 2020, Speech Commun..

[26]  Ascensión Gallardo-Antolín,et al.  Enhancement of a text-independent speaker verification system by using feature combination and parallel structure classifiers , 2018, Neural Computing and Applications.

[27]  Jian-Da Wu,et al.  Speaker identification based on the frame linear predictive coding spectrum technique , 2009, Expert Syst. Appl..

[28]  Yucheng Yang,et al.  Mobile intelligent terminal speaker identification for real-time monitoring system of sports training , 2020 .

[29]  Sabyasachi Patra,et al.  Silence Removal and Endpoint Detection of Speech Signal for Text Independent Speaker Identification , 2014 .

[30]  Yongming Huang,et al.  Feature fusion methods research based on deep belief networks for speech emotion recognition under noise condition , 2017, Journal of Ambient Intelligence and Humanized Computing.

[31]  Johan A. K. Suykens,et al.  Least Squares Support Vector Machine Classifiers , 1999, Neural Processing Letters.

[32]  Yanning Zhang,et al.  Hybrid Genetic and Variational Expectation-Maximization Algorithm for Gaussian-Mixture-Model-Based Brain MR Image Segmentation , 2011, IEEE Transactions on Information Technology in Biomedicine.

[33]  Pawan K. Ajmera,et al.  Text-independent speaker identification using Radon and discrete cosine transforms based features from speech spectrogram , 2011, Pattern Recognit..

[34]  Hye-jin Shim,et al.  Avoiding Speaker Overfitting in End-to-End DNNs Using Raw Waveform for Text-Independent Speaker Verification , 2018, INTERSPEECH.

[35]  Abbes Amira,et al.  Speaker identification using multimodal neural networks and wavelet analysis , 2015, IET Biom..

[36]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[37]  S. D. Shirbahadurkar,et al.  Speaker identification of whispering speech: an investigation on selected timbrel features and KNN distance measures , 2018, Int. J. Speech Technol..

[38]  Driss Matrouf,et al.  Forensic speaker recognition , 2009, IEEE Signal Process. Mag..

[39]  Tao Zhang,et al.  An overview of speech endpoint detection algorithms , 2020 .

[40]  R. Togneri,et al.  An Overview of Speaker Identification: Accuracy and Robustness Issues , 2011, IEEE Circuits and Systems Magazine.

[41]  Linhui Sun,et al.  Deep and shallow features fusion based on deep convolutional neural network for speech emotion recognition , 2018, Int. J. Speech Technol..

[42]  Teh Ying Wah,et al.  Data fusion and multiple classifier systems for human activity detection and health monitoring: Review and open research directions , 2019, Inf. Fusion.

[43]  Wei Wang,et al.  A network model of speaker identification with new feature extraction methods and asymmetric BLSTM , 2020, Neurocomputing.

[44]  Yong Gao,et al.  Acoustic feature extraction method for robust speaker identification , 2015, Multimedia Tools and Applications.

[45]  Julian Fiérrez,et al.  Multiple classifiers in biometrics. part 1: Fundamentals and review , 2018, Inf. Fusion.

[46]  Geoffrey E. Hinton,et al.  Learning and relearning in Boltzmann machines , 1986 .

[47]  Tong Li,et al.  GMM and CNN Hybrid Method for Short Utterance Speaker Recognition , 2018, IEEE Transactions on Industrial Informatics.

[48]  Abbes Amira,et al.  Text-Independent Speaker Identification Using Vowel Formants , 2015, Journal of Signal Processing Systems.

[49]  Tarek A. Tutunji,et al.  Speaker identification using vowels features through a combined method of formants, wavelets, and neural network classifiers , 2015, Appl. Soft Comput..

[50]  Shashidhar G. Koolagudi,et al.  Neural network based feature transformation for emotion independent speaker identification , 2012, Int. J. Speech Technol..

[51]  Zrar Khalid Abdul.  Kurdish speaker identification based on one dimensional convolutional neural network , 2019 .

[52]  Linhui Sun,et al.  Text-independent speaker identification based on deep Gaussian correlation supervector , 2019, Int. J. Speech Technol..

[53]  Jian-Da Wu,et al.  Speaker identification system using empirical mode decomposition and an artificial neural network , 2011, Expert Syst. Appl..

[54]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[55]  Sid-Ahmed Selouani,et al.  Maximum entropy PLDA for robust speaker recognition under speech coding distortion , 2019, International Journal of Speech Technology.

[56]  Goutam Saha,et al.  Improved Text-Independent Speaker Identification using Fused MFCC and IMFCC Feature Sets based on Gaussian Filter , 2009 .

[57]  Ausif Mahmood,et al.  Review of Deep Learning Algorithms and Architectures , 2019, IEEE Access.

[58]  Diogo R. Ferreira,et al.  Preprocessing techniques for context recognition from accelerometer data , 2010, Personal and Ubiquitous Computing.

[59]  John H. L. Hansen,et al.  Speaker Identification Within Whispered Speech Audio Streams , 2011, IEEE Transactions on Audio, Speech, and Language Processing.

[60]  Abdellah Adib,et al.  Speaker Identification for OFDM-Based Aeronautical Communication System , 2019, Circuits Syst. Signal Process..

[61]  Dong Yue,et al.  Multi-View Stacking Ensemble for Power Consumption Anomaly Detection in the Context of Industrial Internet of Things , 2018, IEEE Access.

[62]  Radu Tudor Ionescu,et al.  Local Learning With Deep and Handcrafted Features for Facial Expression Recognition , 2018, IEEE Access.

[63]  D. Barone,et al.  Speaker identification using nonlinear dynamical features , 2002 .

[64]  José Antonio Camarena Ibarrola,et al.  Efficient speaker identification using spectral entropy , 2019, Multimedia Tools and Applications.

[65]  Hamed Haddadi,et al.  Deep Learning in Mobile and Wireless Networking: A Survey , 2018, IEEE Communications Surveys & Tutorials.

[66]  Chang-Hong Lin,et al.  Speaker Identification With Whispered Speech for the Access Control System , 2015, IEEE Transactions on Automation Science and Engineering.

[67]  Ying Wah Teh,et al.  Text-Independent Speaker Identification Through Feature Fusion and Deep Neural Network , 2020, IEEE Access.

[68]  Kittisak Kerdprasop,et al.  Text-Independent Speaker Identification Using Deep Learning Model of Convolution Neural Network , 2019, International Journal of Machine Learning and Computing.

[69]  Atsushi Nakamura.  Acoustic modeling for speech recognition based on a generalized Laplacian mixture distribution , 2002 .

[70]  Seyed Reza Shahamiri,et al.  A Multi-Views Multi-Learners Approach Towards Dysarthric Speech Recognition Using Multi-Nets Artificial Neural Networks , 2014, IEEE Transactions on Neural Systems and Rehabilitation Engineering.

[71]  Joseph Picone,et al.  Signal modeling techniques in speech recognition , 1993, Proc. IEEE.

[72]  Ismail Shahin,et al.  Novel cascaded Gaussian mixture model-deep neural network classifier for speaker identification in emotional talking environments , 2018, Neural Computing and Applications.

[73]  Kasiprasad Mannepalli,et al.  A novel Adaptive Fractional Deep Belief Networks for speaker emotion recognition , 2017 .

[74]  Artur S. d'Avila Garcez,et al.  Speaker recognition with hybrid features from a deep belief network , 2018, Neural Computing and Applications.

[75]  Kandarpa Kumar Sarma,et al.  Vowel Phoneme Segmentation for Speaker Identification Using an ANN-Based Framework , 2013, J. Intell. Syst..

[76]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[77]  DeLiang Wang,et al.  Robust Speaker Identification in Noisy and Reverberant Conditions , 2014, IEEE/ACM Trans. Audio, Speech & Language Processing.

[78]  K. Manikandan,et al.  Speaker Identification using a Novel Prosody with Fuzzy based Hierarchical Decision Tree Approach , 2016 .

[79]  Roland R. Draxler,et al.  Root mean square error (RMSE) or mean absolute error (MAE) , 2014 .

[80]  Michael S. Lew,et al.  Deep learning for visual understanding: A review , 2016, Neurocomputing.

[81]  Fabio Tamburini,et al.  Linguistic features and automatic classifiers for identifying mild cognitive impairment and dementia , 2021, Comput. Speech Lang..

[82]  M. Bilginer Gülmezoglu,et al.  Common vector approach and its combination with GMM for text-independent speaker recognition , 2011, Expert Syst. Appl..

[83]  Derya Avci,et al.  An expert system for speaker identification using adaptive wavelet sure entropy , 2009, Expert Syst. Appl..

[84]  Douglas A. Reynolds,et al.  Robust text-independent speaker identification using Gaussian mixture speaker models , 1995, IEEE Trans. Speech Audio Process..

[85]  Lukás Burget,et al.  Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition , 2018, Comput. Speech Lang..

[86]  Rafik A. Goubran,et al.  Robust voice activity detection using higher-order statistics in the LPC residual domain , 2001, IEEE Trans. Speech Audio Process..

[87]  Dominique Genoud,et al.  POLYCOST: A telephone-speech database for speaker recognition , 2000, Speech Commun..

[88]  Joon-Hyuk Chang,et al.  Ensemble of deep neural networks using acoustic environment classification for statistical model-based voice activity detection , 2016, Comput. Speech Lang..

[89]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.