'Could You Describe the Reason for the Transfer?': A Reinforcement Learning Based Voice-Enabled Bot Protecting Customers from Financial Frauds

With the boom in Internet finance and e-payment services, telecommunication and online fraud has become a serious and rapidly growing problem. In China, 351 billion RMB (approximately 0.3% of China's GDP) was lost to telecommunication and online fraud in 2018, affecting tens of millions of individual customers. Major Internet finance companies have widely adopted anti-fraud algorithms to detect and block scam-induced transactions. However, because contextual information is limited, most systems are prone to blocking normal transactions by mistake, which leads to a poor user experience. Conversely, if a scam-induced transaction is detected but not fully explained to the user, the user will continue to pay and suffer direct financial losses. To address these problems, we design a voice-enabled bot that interacts with customers whom the back-end system flags as potentially involved in telecommunication or online fraud. The bot seeks additional information from customers through natural conversation to confirm whether they are being scammed and to identify the actual fraud type. It then provides details about the fraud to convince customers that they are on the verge of being scammed. Our bot uses offline reinforcement learning (RL) to learn dialogue policies from real-world human-human chat logs. During a conversation, the bot also identifies the fraud type at every turn based on the dialogue state. In offline evaluations, the proposed bot outperforms baseline dialogue strategies by 2.8% in task success rate and by 5% in dialogue accuracy. Furthermore, over 8 months of real-world deployment, our bot lowers the dissatisfaction rate by 25% and increases the fraud prevention rate by 135% in relative terms, indicating a significant improvement in both user experience and anti-fraud effectiveness. More importantly, we help prevent millions of users from being deceived and avoid trillions in financial losses.
