Combining Multiple Statistical Classifiers to Improve the Accuracy of Task Classification

Task classification is an important subproblem of Spoken Language Understanding (SLU) in automated systems that provide a natural language user interface; its goal is to identify the topic of a user's query. This paper presents a combination of multiple statistical classifiers to improve the accuracy of task classification in the domain of city public transportation information inquiry. Three typical types of statistical classifiers are trained on the same data to serve as the base classifiers of the combination system: a naive Bayes classifier, an n-gram model, and support vector machines. A two-stage classification method is employed to combine them and yield better overall performance. Our experiments showed that support vector machines substantially outperform the other base classifiers for task classification in our domain. Comparative experiments between two-stage classification and a voting strategy indicated that, when the best base classifier far outperforms the other base classifiers, two-stage classification is more effective and can produce better results than the best component classifier.
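The contrast between voting and two-stage combination can be illustrated with a minimal sketch. The abstract does not say how the second stage is built, so the version here is an assumption: it treats the second stage as a lookup table, learned from held-out data, that maps each tuple of base-classifier outputs to the gold label seen most often with it. The function names, the task labels ("route", "fare", "schedule"), and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def plurality_vote(predictions, tie_break_index=0):
    """Return the label chosen by the most base classifiers.

    On a tie, fall back to the classifier at tie_break_index
    (assumed here to be the strongest base classifier).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    winners = [label for label, n in counts.items() if n == best]
    return winners[0] if len(winners) == 1 else predictions[tie_break_index]


def train_second_stage(base_outputs, gold_labels):
    """Learn a table from tuples of base outputs to the most frequent gold label.

    This is an assumed, deliberately simple second stage, not the paper's method.
    """
    table = defaultdict(Counter)
    for outputs, gold in zip(base_outputs, gold_labels):
        table[tuple(outputs)][gold] += 1
    return {key: c.most_common(1)[0][0] for key, c in table.items()}


def two_stage_predict(table, outputs, fallback_index=0):
    """Look up the second-stage label; fall back to one base classifier if unseen."""
    return table.get(tuple(outputs), outputs[fallback_index])


# Hypothetical held-out data; each tuple is ordered (SVM, naive Bayes, n-gram).
held_out_outputs = [
    ("route", "fare", "fare"),
    ("route", "fare", "fare"),
    ("fare", "fare", "fare"),
]
held_out_gold = ["route", "route", "fare"]

table = train_second_stage(held_out_outputs, held_out_gold)
query = ("route", "fare", "fare")
vote_label = plurality_vote(query)
stage_label = two_stage_predict(table, query)
```

In this toy case the two weaker classifiers outvote the stronger one, so `plurality_vote` returns "fare", while the second stage has learned that this pattern of disagreement usually corresponds to "route". This is one way a two-stage scheme could benefit when a single base classifier dominates; the paper's actual second stage may differ.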
