Learning to Rap Battle with Bilingual Recursive Neural Networks

We describe an unconventional approach to teaching machines to rap battle by improvising hip hop lyrics on the fly. A novel recursive bilingual neural network, TRAAM, implicitly learns soft, context-dependent generalizations over the structural relationships between associated parts of challenge and response raps, while avoiding the exponential complexity costs that symbolic models would incur. TRAAM learns feature vectors using context from the challenge and the response simultaneously, so that challenge-response association patterns with similar structure tend to have similar vectors. We model improvisation as a quasi-translation learning problem in which TRAAM is trained to improvise fluent, rhyming responses to challenge lyrics. The soft structural relationships learned by TRAAM are then used to improve the probabilistic responses generated by our improvisational response component.
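To make the idea of recursively composed feature vectors concrete, the following is a minimal, untrained sketch of a generic recursive auto-associative step: two child vectors are compressed into one parent vector of the same size, and a decoder tries to reconstruct the children. It is not the TRAAM architecture itself, whose details are not given in this abstract. The class `TinyRAAM`, the function `encode_tree`, the dimension `DIM`, and the example leaf labels are all hypothetical names chosen for illustration, and the weights are random rather than learned.

```python
import math
import random

DIM = 4  # hypothetical feature-vector size, chosen only for illustration


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def affine_tanh(matrix, bias, vec):
    """Compute tanh(matrix @ vec + bias) using plain Python lists."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, vec)) + b)
        for row, b in zip(matrix, bias)
    ]


class TinyRAAM:
    """Generic recursive auto-associative step (assumed form, untrained)."""

    def __init__(self, dim=DIM, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.enc_w = make_matrix(dim, 2 * dim, rng)  # compress two children
        self.enc_b = [0.0] * dim
        self.dec_w = make_matrix(2 * dim, dim, rng)  # reconstruct two children
        self.dec_b = [0.0] * (2 * dim)

    def encode(self, left, right):
        """Compress two child vectors into one parent vector of size dim."""
        return affine_tanh(self.enc_w, self.enc_b, left + right)

    def decode(self, parent):
        """Reconstruct approximations of the two children from the parent."""
        out = affine_tanh(self.dec_w, self.dec_b, parent)
        return out[: self.dim], out[self.dim:]

    def reconstruction_error(self, left, right):
        """Squared error between the children and their reconstructions."""
        rec_left, rec_right = self.decode(self.encode(left, right))
        return sum(
            (a - b) ** 2 for a, b in zip(left + right, rec_left + rec_right)
        )


def encode_tree(model, tree, leaf_vectors):
    """Encode a binary tree bottom-up; leaves are keys into leaf_vectors."""
    if isinstance(tree, str):
        return leaf_vectors[tree]
    left, right = tree
    return model.encode(
        encode_tree(model, left, leaf_vectors),
        encode_tree(model, right, leaf_vectors),
    )


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical leaves, each standing for one challenge/response word pair.
    leaves = {
        name: [rng.uniform(-1, 1) for _ in range(DIM)]
        for name in ("pair_a", "pair_b", "pair_c")
    }
    model = TinyRAAM()
    root = encode_tree(model, (("pair_a", "pair_b"), "pair_c"), leaves)
    print(len(root), model.reconstruction_error(leaves["pair_a"], leaves["pair_b"]))
```

In a trained model of this kind, the reconstruction error would serve as the training signal, so that structures composed in similar ways end up with similar parent vectors. Here the weights are left random, so the sketch shows only the shape of the computation.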
