Computational Linguistics and Intelligent Text Processing

Character-based models have become increasingly popular for a range of natural language processing tasks, especially due to the success of neural networks. They make it possible to model text sequences directly, without the need for tokenization, and can therefore enhance the traditional preprocessing pipeline. This paper provides an overview of character-based models for a variety of natural language processing tasks. We group existing work into three categories: tokenization-based approaches, bag-of-n-gram models, and end-to-end models. For each category, we present prominent examples of studies, with a particular focus on recent character-based deep learning work.
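As a rough illustration of the second category, a bag of character n-grams can be built directly from raw text without any tokenization step. The sketch below is a minimal, assumed example and is not taken from any of the surveyed systems; the function name `char_ngrams` and the choice of n = 3 are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return a bag (multiset) of overlapping character n-grams.

    Hypothetical helper for illustration only: the text is treated as a
    plain character sequence, so no tokenizer is involved.
    """
    # Slide a window of width n over the string; texts shorter than n
    # yield an empty bag because the range is empty.
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


bag = char_ngrams("banana", n=3)
print(bag.most_common())
```

Counts such as these could serve as sparse input features to a classifier; the end-to-end models in the third category instead learn representations from the character sequence itself.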
