Conversational agents, or chatbots (short for "chat robot"), are a branch of Natural Language Processing (NLP) that has attracted considerable interest recently due to the large number of applications in company services, such as customer support and automated FAQs, and in personal assistant services such as Siri or Cortana. There are three types of chatbot: rule-based models, retrieval-based models, and generative models. They differ in how much freedom they have when generating an answer to a given question. The chatbot models usually deployed in public services are rule-based or retrieval-based, given the need to guarantee high-quality, appropriate answers to users. However, these models can only handle conversations aligned with their pre-written answers, so a conversation can sound artificial when it strays off topic. Generative models handle open conversation better, which makes them a more generalizable approach. Promising results have been achieved with generative models by applying neural machine translation techniques based on the recurrent encoder/decoder architecture. In this project, two generative models that constitute the state of the art in neural machine translation applied to chatbots are implemented, compared, and analyzed: one is based on recurrence with attention, and the other is based exclusively on attention. Additionally, a model based exclusively on recurrence is used as a baseline. Experiments show that, as in translation, an architecture based only on attention mechanisms obtains better results than the recurrence-based models.
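
For illustration, the following is a minimal NumPy sketch of scaled dot-product attention, the core operation of the attention-only architecture ("Attention Is All You Need"); it is not the project's implementation, and the function names, array shapes, and toy data are assumptions chosen for readability.

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax over the given axis.
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def scaled_dot_product_attention(Q, K, V):
        # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
        # Q: (seq_q, d_k) queries, K: (seq_k, d_k) keys, V: (seq_k, d_v) values.
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)      # (seq_q, seq_k) similarity matrix
        weights = softmax(scores, axis=-1)   # each row is an attention distribution
        return weights @ V                   # weighted average of the values

    # Toy usage: 3 query positions attending over 4 key/value positions.
    rng = np.random.default_rng(0)
    Q = rng.standard_normal((3, 8))
    K = rng.standard_normal((4, 8))
    V = rng.standard_normal((4, 16))
    print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 16)

The contrast with the recurrent models is visible in the single matrix product: every output position attends over the whole input sequence at once, whereas a recurrent encoder must consume the sequence token by token and compress it into a hidden state.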