Statistical Testing of Segment Homogeneity in Classification of Piecewise-Regular Objects

Abstract: This paper addresses multi-class classification of composite (piecewise-regular) objects such as speech signals and complex images. We propose a mathematical model that represents a composite object as a sequence of independent segments, each of which is a random sample of independent, identically distributed feature vectors. Using this model and a statistical approach, we reduce the task to composite hypothesis testing of segment homogeneity. Several nearest-neighbor criteria are implemented, and for some of them well-known special cases are highlighted (e.g., the Kullback–Leibler minimum information discrimination principle and the probabilistic neural network). Experiments show that the proposed approach improves accuracy compared with contemporary classifiers.
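The segment model can be illustrated with a minimal Python sketch. It is not the paper's implementation; it only shows one simple nearest-neighbor rule of the Kullback–Leibler kind that the abstract names as a special case. Several things here are assumptions made for illustration: each segment is summarized by a diagonal Gaussian, the query and reference objects have the same number of aligned segments, and all function names (`fit_diag_gaussian`, `kl_diag_gaussian`, `object_distance`, `classify`, `make_object`) are hypothetical.

```python
import math
import random


def fit_diag_gaussian(sample):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    n = len(sample)
    dim = len(sample[0])
    means = [sum(x[d] for x in sample) / n for d in range(dim)]
    # Assumption: a small variance floor keeps the logarithm and division stable.
    variances = [
        max(sum((x[d] - means[d]) ** 2 for x in sample) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances


def kl_diag_gaussian(p, q):
    """KL(p || q) for two diagonal Gaussians given as (means, variances)."""
    (mp, vp), (mq, vq) = p, q
    return 0.5 * sum(
        math.log(b / a) + (a + (m1 - m2) ** 2) / b - 1.0
        for m1, a, m2, b in zip(mp, vp, mq, vq)
    )


def object_distance(query_segments, reference_segments):
    """Sum segment-wise divergences; segments are treated as independent and aligned."""
    return sum(
        kl_diag_gaussian(fit_diag_gaussian(q), fit_diag_gaussian(r))
        for q, r in zip(query_segments, reference_segments)
    )


def classify(query_segments, references):
    """Return the label of the reference object closest to the query."""
    return min(
        references,
        key=lambda label: object_distance(query_segments, references[label]),
    )


def make_object(segment_means, n=200, seed=0):
    """Synthetic object: one i.i.d. sample of 2-D feature vectors per segment."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(m, 1.0), rng.gauss(-m, 1.0)] for _ in range(n)]
        for m in segment_means
    ]


# Two reference objects (one per class) whose segments appear in opposite order.
references = {
    "A": make_object([0.0, 3.0], seed=1),
    "B": make_object([3.0, 0.0], seed=2),
}
query = make_object([0.0, 3.0], seed=3)
predicted = classify(query, references)
print(predicted)
```

Summing the per-segment divergences follows from the assumed independence of segments. A homogeneity test in the stricter sense would compare each query segment with the pooled query and reference samples rather than with the reference alone, and this sketch does not attempt that.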
