Meta-learning of textual representations

Recent progress in AutoML has led to state-of-the-art methods (e.g., AutoSKLearn) that can be readily used by non-experts to approach any supervised learning problem. While these methods are quite effective, they are still limited in the sense that they work for tabular (matrix-formatted) data only. This paper describes one step forward in trying to automate the design of supervised learning methods in the context of text mining. We introduce a meta-learning methodology for automatically obtaining a representation for text mining tasks starting from raw text. We report experiments considering 60 different textual representations and more than 80 text mining datasets associated with a wide variety of tasks. Experimental results show that the proposed methodology is a promising solution for obtaining highly effective off-the-shelf text classification pipelines.
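The core idea can be illustrated with a short sketch. The snippet below is a minimal, hypothetical illustration (not the authors' implementation): a meta-learner is trained to map simple dataset-level meta-features (number of documents, vocabulary size, average document length) to the textual representation that performed best on previously seen tasks, and then recommends a representation for a new raw-text corpus. All names, the candidate representations, and the toy meta-dataset are assumptions for illustration only.

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical candidate textual representations; the paper considers 60.
REPRESENTATIONS = ["bow_tfidf", "char_ngrams", "word2vec_avg", "lda_topics"]


def meta_features(corpus):
    """Compute simple, hypothetical dataset-level meta-features."""
    n_docs = len(corpus)
    tokens = [tok for doc in corpus for tok in doc.split()]
    vocab_size = len(set(tokens))
    avg_doc_len = len(tokens) / max(n_docs, 1)
    return [n_docs, vocab_size, avg_doc_len]


# Toy meta-dataset: one row per previously seen text mining task, labeled
# with the representation that achieved the best downstream performance.
meta_X = [
    [1000, 8000, 120.0],   # e.g., long-document topic classification
    [5000, 3000, 12.0],    # e.g., short-text classification
    [800, 15000, 400.0],
    [12000, 2500, 9.0],
]
meta_y = ["bow_tfidf", "char_ngrams", "lda_topics", "word2vec_avg"]

# The meta-learner: any off-the-shelf classifier over meta-features.
meta_learner = RandomForestClassifier(n_estimators=100, random_state=0)
meta_learner.fit(meta_X, meta_y)

# Given a new raw-text corpus, recommend a representation before any
# task-specific training takes place.
new_corpus = ["a short example document", "another brief text", "more raw text"]
recommended = meta_learner.predict([meta_features(new_corpus)])[0]
print(f"Recommended representation: {recommended}")
```

In this sketch the recommendation step costs only a meta-feature computation and a single classifier prediction, which is what makes a meta-learned representation selector attractive as an off-the-shelf component of a text classification pipeline.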
