Modeling Human-Machine Interaction by Means of a Sample Selection Method

This paper presents a practical application of sample selection techniques to model how a conversational agent selects its next system response. This process is modeled as a classification task that takes the dialog history as input and produces the next system response as output. Our proposal addresses the problem of imbalanced training data that is usually present in this application domain. It improves the classifier’s performance by automatically selecting, during the training phase, the examples that are difficult to classify, using two criteria: the proximity of an example to the decision border and its typicality. We apply this technique to a conversational agent that provides railway information. Simulation results support the usefulness of the proposed approach for improving the selection of the agent's responses.
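
As an illustration of the border-proximity criterion only, the following minimal Python sketch selects the training examples that a binary classifier either misclassifies or places close to its decision border. It is not the authors' algorithm: the function names, the 0.5 decision border, the proximity formula, and the `threshold` parameter are all assumptions made for this example, and the typicality criterion is not modeled.

```python
from typing import List, Sequence


def border_proximity(prob: float) -> float:
    """Hypothetical proximity score: 1.0 on the assumed 0.5 border, 0.0 at prob 0 or 1."""
    return 1.0 - 2.0 * abs(prob - 0.5)


def select_hard_examples(
    probs: Sequence[float],
    labels: Sequence[int],
    threshold: float = 0.5,
) -> List[int]:
    """Return indices of examples that are misclassified or lie near the border.

    probs  -- classifier outputs in [0, 1] for the positive class (assumed)
    labels -- true labels, 0 or 1
    """
    selected = []
    for i, (p, y) in enumerate(zip(probs, labels)):
        predicted = 1 if p >= 0.5 else 0
        if predicted != y or border_proximity(p) >= threshold:
            selected.append(i)
    return selected


# Toy data: example 1 is correct but close to the border; 3 and 4 are misclassified.
probs = [0.95, 0.55, 0.10, 0.40, 0.80]
labels = [1, 1, 0, 1, 0]
print(select_hard_examples(probs, labels))  # [1, 3, 4]
```

In a training loop, the selected indices could be used to restrict or reweight the examples presented in the next epoch, which is one way a minority class with many hard examples can receive more attention than under uniform sampling.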
