Paper Information - Local Feature Extractors Accelerating HNNP for Phoneme Recognition

Artificial neural networks are fast in the application phase but very slow to train. At the same time, some state-of-the-art neural network approaches perform very well on image classification tasks, such as the hybrid neural network plait (HNNP), which classifies images derived from signal data, for instance from phonemes. We propose to accelerate HNNP for phoneme recognition by replacing its most computationally expensive component, the convolutional neural network, with a preceding local feature extractor followed by a simpler and faster neural network. To this end, we propose suitable feature extractors for this problem and compare the resulting computation costs and classification performance. Our experiments show that HNNP with the best of the proposed feature extractors, combined with a smaller neural network, is more than twice as fast as HNNP with the more complex convolutional neural network while still delivering good classification performance.
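The substitution described above can be illustrated with a minimal sketch. It is not the feature extractors or networks proposed in the paper: the patch-wise mean and standard deviation features, the function names `local_features` and `small_network`, the patch size, the layer sizes, and the random input are all assumptions chosen only to show the general shape of "fixed local feature extractor followed by a small network".

```python
import math
import random


def local_features(image, patch=4):
    """Hypothetical local feature extractor (an assumption, not the paper's):
    mean and standard deviation of each non-overlapping patch."""
    rows = len(image) - len(image) % patch      # crop so patches tile evenly
    cols = len(image[0]) - len(image[0]) % patch
    means, stds = [], []
    for r in range(0, rows, patch):
        for c in range(0, cols, patch):
            values = [image[i][j]
                      for i in range(r, r + patch)
                      for j in range(c, c + patch)]
            mean = sum(values) / len(values)
            var = sum((v - mean) ** 2 for v in values) / len(values)
            means.append(mean)
            stds.append(math.sqrt(var))
    return means + stds


def dense(x, weights, biases):
    """Fully connected layer; weights holds one weight vector per output unit."""
    return [sum(xi * wi for xi, wi in zip(x, unit)) + b
            for unit, b in zip(weights, biases)]


def small_network(x, w1, b1, w2, b2):
    """One tanh hidden layer followed by a softmax over the classes."""
    hidden = [math.tanh(v) for v in dense(x, w1, b1)]
    logits = dense(hidden, w2, b2)
    top = max(logits)                           # subtract max for stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
# Stand-in for an image derived from a phoneme signal (random values here).
image = [[rng.random() for _ in range(16)] for _ in range(16)]
features = local_features(image)                # 16 patches -> 32 features

n_hidden, n_classes = 8, 5                      # arbitrary illustrative sizes
w1 = [[rng.gauss(0, 0.1) for _ in features] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
w2 = [[rng.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_classes)]
b2 = [0.0] * n_classes

probs = small_network(features, w1, b1, w2, b2)  # untrained class probabilities
```

In this sketch the extractor has no trainable parameters, so only the small network would need training, which is the intuition behind the speed-up the abstract reports. The weights here are random and untrained, so the output probabilities carry no meaning.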