Nyström Methods for Efficient Kernel-Based Methods for Community Question Answering

English. Expressive but complex kernel functions, such as Sequence or Tree Kernels, are usually underemployed in NLP tasks, e.g., in community Question Answering (cQA), due to their significant complexity in both the learning and classification stages. Recently, the Nyström methodology for data embedding has been proposed as a viable solution to these scalability problems. By mapping data into low-dimensional approximations of kernel spaces, it increases scalability through compact linear representations of highly structured data. In this paper, we show that the Nyström methodology can be effectively used to apply a kernel-based method to the cQA task, achieving state-of-the-art results while reducing the computational cost by orders of magnitude.

Italian. Machine learning methods based on complex kernel functions, such as Sequence or Tree Kernels, risk being inadequately exploited in natural language processing problems (for example, in Community Question Answering) because of the high computational costs of training and classification. Recently, a methodology based on the Nyström method has been proposed to address these scalability problems: it projects the examples observed during training and classification into low-dimensional spaces that approximate the space underlying the kernel function. These compact representations make it possible to apply extremely efficient and scalable machine learning algorithms. In this work we show that kernel methods can be applied to the Community Question Answering problem, achieving state-of-the-art results while reducing the costs by orders of magnitude.
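
As a rough illustration of the embedding step described above, the sketch below (Python/NumPy; the function name nystrom_embedding, the matrices K_nm and K_mm, and the landmark-sampling choice are illustrative assumptions, not the authors' implementation) maps examples into the low-rank space induced by a small set of landmark examples, so that a fast linear learner can be trained on the resulting vectors instead of the full kernel machine.

```python
import numpy as np

def nystrom_embedding(K_nm, K_mm, eps=1e-12):
    """Map n examples into the low-dimensional space spanned by m landmarks.

    K_nm: (n, m) kernel evaluations between the examples and the landmarks.
    K_mm: (m, m) kernel evaluations among the landmarks.
    Returns an (n, k) matrix whose row dot products approximate the kernel.
    """
    # Eigendecompose the (symmetric, positive semi-definite) landmark Gram matrix.
    eigvals, eigvecs = np.linalg.eigh(K_mm)
    # Drop numerically null directions and form W^{-1/2}.
    keep = eigvals > eps
    w_inv_sqrt = eigvecs[:, keep] / np.sqrt(eigvals[keep])
    # Each example x is embedded as k(x, landmarks) @ W^{-1/2}.
    return K_nm @ w_inv_sqrt
```

In practice the landmarks are typically a small random sample of the training data; the embedded vectors can then be fed to an efficient linear classifier (e.g., a linear SVM) in place of the original kernel-based learner.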
