Practical User Feedback-driven Internal Search Using Online Learning to Rank

We present Spoke, a system for creating and searching internal knowledge base (KB) articles within organizations. Spoke is available as a Software-as-a-Service (SaaS) product and is deployed across hundreds of organizations spanning diverse domains. Spoke continually improves search quality using conversational user feedback, which allows it to provide a better search experience than standard information retrieval systems without encoding any explicit domain knowledge. We achieve this with a real-time online learning-to-rank (L2R) algorithm that uses a query similarity kernel to automatically customize relevance scoring for each organization deploying Spoke. This paper focuses on incorporating practical considerations into our relevance scoring function and algorithm that make Spoke easy to deploy and able to handle events that naturally occur over the life cycle of any KB deployment. We show that Spoke outperforms competitive baselines by up to 41% in offline F1 comparisons.
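To make the idea of kernel-based online relevance scoring concrete, the following is a minimal sketch and not Spoke's actual algorithm. It assumes a kernel-perceptron-style update in which each piece of user feedback on an article is stored with the query that produced it, and a new query's score for that article is a base retrieval score plus the similarity-weighted sum of past feedback. The class `KernelFeedbackRanker`, the toy `token_jaccard` similarity, the `learning_rate` parameter, and the article identifiers are all hypothetical names introduced for illustration.

```python
from collections import defaultdict


def token_jaccard(q1: str, q2: str) -> float:
    """Toy query similarity (assumption): Jaccard overlap of lowercased tokens."""
    a, b = set(q1.lower().split()), set(q2.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


class KernelFeedbackRanker:
    """Hypothetical online ranker: base score plus kernel-weighted feedback."""

    def __init__(self, kernel=token_jaccard, learning_rate=1.0):
        self.kernel = kernel
        self.learning_rate = learning_rate
        # article_id -> list of (past_query, signed_weight) feedback events
        self.feedback = defaultdict(list)

    def score(self, query, article_id, base_score=0.0):
        # Feedback given on similar past queries shifts the score more
        # than feedback given on dissimilar ones.
        events = self.feedback.get(article_id, ())
        adjustment = sum(w * self.kernel(query, q) for q, w in events)
        return base_score + adjustment

    def update(self, query, article_id, helpful):
        # Online update: record one signed feedback event immediately.
        weight = self.learning_rate if helpful else -self.learning_rate
        self.feedback[article_id].append((query, weight))

    def rank(self, query, base_scores):
        # base_scores maps article_id -> score from any standard retriever.
        return sorted(
            base_scores,
            key=lambda a: self.score(query, a, base_scores[a]),
            reverse=True,
        )


ranker = KernelFeedbackRanker()
base = {"vpn-setup": 0.4, "wifi-guest": 0.5}
before = ranker.rank("connect to vpn from home", base)
ranker.update("how do i connect to vpn", "vpn-setup", helpful=True)
after = ranker.rank("connect to vpn from home", base)
```

In this sketch, positive feedback on one phrasing of a question promotes the same article for similar phrasings, and because the feedback store belongs to a single ranker instance, a separate instance per organization would keep each deployment's customization isolated.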
