Efficient Hyper-parameter Optimization for NLP Applications

Hyper-parameter optimization is an important problem in natural language processing (NLP) and machine learning. Recently, a group of studies has focused on using sequential Bayesian Optimization to solve this problem, with the aim of reducing the number of iterations and trials required during the optimization process. In this paper, we explore this problem from a different angle and propose a multi-stage hyper-parameter optimization that breaks the problem into multiple stages with increasing amounts of data. Early stages provide fast estimates of good candidates, which are used to initialize later stages for better performance and speed. We demonstrate the utility of this new algorithm by evaluating its speed and accuracy against state-of-the-art Bayesian Optimization algorithms on classification and prediction tasks.
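The staged idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in plain random search for the per-stage optimizer instead of Bayesian Optimization, and every name in it (`multi_stage_search`, `toy_loss`, `sample_config`, the data fractions, `keep_top`) is a hypothetical choice made for illustration. Each stage evaluates candidates on a larger fraction of the data, and the best candidates of one stage are carried over to seed the next.

```python
import math
import random


def multi_stage_search(evaluate, sample_config, data_fractions=(0.1, 0.3, 1.0),
                       trials_per_stage=20, keep_top=5, seed=0):
    """Illustrative multi-stage search (assumption: random search per stage).

    evaluate(config, fraction) returns a loss measured on `fraction` of the data.
    sample_config(rng) draws one random configuration.
    """
    rng = random.Random(seed)
    carried = []   # best configurations from the previous stage
    history = []   # (fraction, best_loss, best_config) for each stage
    for fraction in data_fractions:
        # Start from the survivors of the cheaper stage, then top up
        # with fresh random samples until the stage budget is reached.
        candidates = list(carried)
        while len(candidates) < trials_per_stage:
            candidates.append(sample_config(rng))
        scored = sorted(((evaluate(cfg, fraction), cfg) for cfg in candidates),
                        key=lambda pair: pair[0])
        carried = [cfg for _, cfg in scored[:keep_top]]
        history.append((fraction, scored[0][0], scored[0][1]))
    best_loss, best_config = scored[0]
    return best_config, best_loss, history


def toy_loss(config, fraction):
    """Hypothetical objective: the optimum is at log_c = 1, and the
    estimate is distorted more when less data is used."""
    distortion = (1.0 - fraction) * 0.1 * abs(math.sin(37.0 * config["log_c"]))
    return (config["log_c"] - 1.0) ** 2 + distortion


def sample_config(rng):
    """Hypothetical search space: one regularization exponent in [-3, 3]."""
    return {"log_c": rng.uniform(-3.0, 3.0)}


best_config, best_loss, history = multi_stage_search(toy_loss, sample_config)
print(best_config, best_loss)
```

In a real setting `evaluate` would train a model on a subsample of the training set and return a validation loss, so the early stages are cheap and only the final stage pays the full-data cost.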

Bowen Zhou | Bing Xiang | Lidan Wang | Sridhar Mahadevan | Minwei Feng

[1] Chih-Jen Lin, et al. LIBLINEAR: A Library for Large Linear Classification, 2008, J. Mach. Learn. Res.

[2] Jasper Snoek, et al. Multi-Task Bayesian Optimization, 2013, NIPS.

[3] Ellen M. Voorhees. The TREC question answering track, 2001.

[4] Mikhail Bilenko, et al. Lazy Paired Hyper-Parameter Tuning, 2013, IJCAI.

[5] Nando de Freitas, et al. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning, 2010, ArXiv.

[6] Kevin Leyton-Brown, et al. Surrogate Benchmarks for Hyperparameter Optimization, 2014, MetaSel@ECAI.

[7] Chris Eliasmith, et al. Hyperopt-Sklearn: Automatic Hyperparameter Configuration for Scikit-Learn, 2014, SciPy.

[8] Kevin Leyton-Brown, et al. Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms, 2012, KDD.

[9] Cristina V. Lopes, et al. Bagging gradient-boosted trees for high precision, low variance ranking models, 2011, SIGIR.

[10] Jasper Snoek, et al. Practical Bayesian Optimization of Machine Learning Algorithms, 2012, NIPS.

[11] Frank Hutter, et al. Initializing Bayesian Hyperparameter Optimization via Meta-Learning, 2015, AAAI.

[12] Kevin Leyton-Brown, et al. Sequential Model-Based Optimization for General Algorithm Configuration, 2011, LION.

[13] Yoshua Bengio, et al. Algorithms for Hyper-Parameter Optimization, 2011, NIPS.

[14] Yoshua Bengio, et al. Random Search for Hyper-Parameter Optimization, 2012, J. Mach. Learn. Res.

[15] Christopher D. Manning, et al. Introduction to Information Retrieval, 2010, J. Assoc. Inf. Sci. Technol.

[16] Kevin Leyton-Brown, et al. Efficient Benchmarking of Hyperparameter Optimizers via Surrogates, 2015, AAAI.

[17] Gideon S. Mann, et al. Efficient Transfer Learning Method for Automatic Hyperparameter Tuning, 2014, AISTATS.

[18] Pascal Vincent, et al. Representation Learning: A Review and New Perspectives, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19] Michèle Sebag, et al. Collaborative hyperparameter tuning, 2013, ICML.

[20] Kevin Leyton-Brown, et al. An Efficient Approach for Assessing Hyperparameter Importance, 2014, ICML.