Knowledge-aware Multimodal Fashion Chatbot

A multimodal fashion chatbot provides a natural and informative way to fulfill customers' fashion needs. However, making it 'smart' enough to generate substantive responses remains a challenging problem. In this paper, we present a multimodal fashion chatbot enriched with domain knowledge. It uses a taxonomy-based learning module to capture the fine-grained semantics in images and leverages an end-to-end neural conversational model to generate responses based on the conversation history, visual semantics, and domain knowledge. To avoid inconsistent dialogues, a deep reinforcement learning method is used to further optimize the model.