i, Poet: Automatic Chinese Poetry Composition through a Generative Summarization Framework under Constrained Optimization

Classical Chinese poems, which follow strict formats and complicated linguistic rules, are part of China's long-lasting cultural heritage. Automatic composition of Chinese poetry by programs is considered a challenging problem in computational linguistics: it demands a high degree of artificial intelligence and has not been well addressed. In this paper, we formulate the poetry composition task as an optimization problem based on a generative summarization framework under several constraints. Given the user-specified writing intents, the system retrieves candidate terms from a large poem corpus and then orders these terms to fit poetry formats while satisfying tonal and rhythm requirements. The optimization process under constraints is conducted via iterative term substitutions until convergence, and outputs the subset with the highest utility as the generated poem. For experiments, we perform generation on a large dataset of 61,960 classic poems from the Tang and Song Dynasties of China. A comprehensive evaluation, using both human judgments and ROUGE scores, demonstrates the effectiveness of our proposed approach.
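To make the idea of constrained optimization by iterative term substitution concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy setting: each candidate term has a character length, a tone class for its final character, and a relevance score; a line is valid if it has the required character count and ends in the required tone; and utility is simply the sum of term scores. The names `Term`, `satisfies`, `utility`, `improve_once`, and `compose`, and all data values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    text: str     # placeholder token, not real corpus data
    length: int   # number of characters the term occupies
    tone: str     # toy tone class of the final character: "L" or "O"
    score: float  # assumed relevance to the user's writing intent


def satisfies(line, line_length, end_tone):
    """Toy format check: exact character count and required line-final tone."""
    return sum(t.length for t in line) == line_length and line[-1].tone == end_tone


def utility(poem):
    """Assumed utility: total relevance of all terms in the poem."""
    return sum(t.score for line in poem for t in line)


def improve_once(poem, candidates, line_length, end_tones):
    """Apply the first substitution that raises utility and keeps the line valid."""
    used = {t for line in poem for t in line}
    for i, line in enumerate(poem):
        for j, old in enumerate(line):
            for cand in candidates:
                if cand in used or cand.score <= old.score:
                    continue
                trial = line[:j] + [cand] + line[j + 1:]
                if satisfies(trial, line_length, end_tones[i]):
                    poem[i] = trial
                    return True
    return False


def compose(initial, candidates, line_length, end_tones):
    """Substitute terms until no valid, utility-improving swap remains."""
    poem = [list(line) for line in initial]
    while improve_once(poem, candidates, line_length, end_tones):
        pass
    return poem


# Hypothetical candidate pool and a valid but low-utility starting line.
a = Term("a", 2, "O", 0.2)
b = Term("b", 3, "L", 0.3)
c = Term("c", 2, "O", 0.9)
d = Term("d", 3, "L", 0.8)
e = Term("e", 3, "O", 1.0)  # highest score, but never fits the constraints here

result = compose([[a, b]], [a, b, c, d, e], line_length=5, end_tones=["L"])
print([t.text for t in result[0]], utility(result))
```

Because every accepted substitution strictly increases utility and the candidate pool is finite, the loop terminates. In this toy run the highest-scoring term `e` is never selected, since placing it would violate either the length or the tone constraint, which illustrates how the constraints shape the final output.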
