Inclusion of manner of articulation to achieve improved phoneme classification accuracy for Bengali continuous speech

In this experiment, a phoneme classification model is developed using a Deep Neural Network based framework. The experiment is conducted in two phases. In the first phase, the phoneme classification task is performed. The deep-structured model achieves an overall classification accuracy of 87.8%. Precision and recall values are reported for every phoneme, and a confusion matrix over all the Bengali phonemes is derived. Using the confusion matrix, the phonemes are clustered into nine groups. These nine groups yield a higher overall classification accuracy of 98.7%, and a new confusion matrix derived for the nine groups shows a lower confusion rate. In the second phase of the experiment, the nine groups are redivided into 15 groups using manner-of-articulation knowledge, and the deep-structured model is retrained. This time the system achieves an overall classification accuracy of 98.9%, which is almost equal to the accuracy observed for the nine groups. However, because the nine groups are redivided into 15 groups, there is less phoneme confusion within each group, which leads to a better phoneme classification model.
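
The relationship between phoneme-level and group-level accuracy can be illustrated with a minimal sketch. It assumes a confusion matrix whose rows are true labels and whose columns are predicted labels. The function names (`overall_accuracy`, `precision_recall`, `collapse`), the four-class toy matrix, and the two-group assignment are all hypothetical and are not taken from the experiment. The sketch only shows why merging frequently confused phonemes into one group raises the overall accuracy: confusions inside a group are counted as correct at group level.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def precision_recall(cm):
    """Per-class (precision, recall); rows are true labels, columns predicted."""
    n = len(cm)
    scores = []
    for k in range(n):
        true_positive = cm[k][k]
        predicted_as_k = sum(cm[i][k] for i in range(n))  # column sum
        actually_k = sum(cm[k])                           # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        scores.append((precision, recall))
    return scores


def collapse(cm, group_of):
    """Merge classes into groups; group_of[i] is the group index of class i."""
    n_groups = max(group_of) + 1
    grouped = [[0] * n_groups for _ in range(n_groups)]
    for i, row in enumerate(cm):
        for j, count in enumerate(row):
            grouped[group_of[i]][group_of[j]] += count
    return grouped


# Toy counts, invented for illustration only (not the paper's data).
toy_cm = [
    [8, 2, 0, 0],
    [3, 7, 0, 0],
    [0, 0, 9, 1],
    [0, 1, 1, 8],
]
# Hypothetical grouping: classes 0 and 1 form group 0, classes 2 and 3 group 1.
toy_groups = [0, 0, 1, 1]

grouped_cm = collapse(toy_cm, toy_groups)
print(overall_accuracy(toy_cm))      # class-level accuracy
print(overall_accuracy(grouped_cm))  # group-level accuracy
print(precision_recall(toy_cm))
```

In this toy case the class-level accuracy is 32/40 and the group-level accuracy is 39/40, because only the single cross-group confusion remains an error after merging. Splitting groups further, as in the second phase, keeps group-level accuracy comparable only if the new boundaries separate classes that the model rarely confuses.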