Local String Transduction as Sequence Labeling

We show that the general problem of string transduction can be reduced to the problem of sequence labeling. String transduction allows character deletions and insertions, whereas sequence labeling assigns exactly one label to each input position and has no counterpart for these operations. We show how to overcome this difference. Our approach can be used with any sequence labeling algorithm, and it works best for problems in which string transduction imposes a strong notion of locality (no long-range dependencies). We experiment with spelling correction for social media, OCR correction, and morphological inflection, and find that our approach performs better than seq2seq models and yields state-of-the-art results in several cases.
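To make the reduction concrete, here is a minimal sketch of one possible encoding; it is an illustration only and not necessarily the scheme used in the paper. It assumes that each input position receives a label equal to the output string emitted at that position: an empty label stands for a deletion, and inserted characters are attached to the label of the preceding position. The padding symbol `BOS` and the helpers `encode` and `decode` are hypothetical names introduced for this example, and the alignment comes from Python's standard `difflib` rather than from any learned model.

```python
from difflib import SequenceMatcher
from typing import List

BOS = "^"  # hypothetical padding symbol: hosts insertions at the start of the string


def encode(source: str, target: str) -> List[str]:
    """Return one label per position of BOS + source.

    Each label is the (possibly empty) output string emitted at that position.
    """
    labels = [""] * (len(source) + 1)  # index 0 belongs to BOS
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Copy: each source character emits itself.
            for i in range(i1, i2):
                labels[i + 1] = source[i]
        elif tag == "delete":
            # Deletion: the source characters emit nothing.
            for i in range(i1, i2):
                labels[i + 1] = ""
        elif tag == "replace":
            # Substitution: the first source character emits the whole
            # replacement, the remaining ones emit nothing.
            labels[i1 + 1] = target[j1:j2]
            for i in range(i1 + 1, i2):
                labels[i + 1] = ""
        elif tag == "insert":
            # Insertion: attach the new characters to the preceding position
            # (padded index i1), which is BOS when i1 == 0.
            labels[i1] += target[j1:j2]
    return labels


def decode(labels: List[str]) -> str:
    """Concatenate the per-position outputs to recover the target string."""
    return "".join(labels)


if __name__ == "__main__":
    for src, tgt in [("teh", "the"), ("walk", "walked"), ("goood", "good")]:
        tags = encode(src, tgt)
        print(list(zip(BOS + src, tags)), "->", decode(tags))
```

Under this encoding the label sequence always has the same length as the padded input, so any off-the-shelf sequence labeler could in principle be trained to predict it. Because each label depends only on a position and its immediate neighbor, the sketch also hints at why such a reduction suits tasks with strong locality.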
