Domain Controlled Title Generation with Human Evaluation

We study automatic title generation and present a method for generating domain-controlled titles for scientific articles. A good title helps research get the attention it deserves. A title can be viewed as a highly compressed description of a document that conveys information about the process it describes. To control the domain, we use a pre-trained text-to-text transformer model together with the additional token technique. Title tokens are sampled from a local distribution over the domain-specific vocabulary, which is a subset of the global vocabulary, rather than from the global vocabulary itself. This yields catchy titles that are closely linked to their corresponding abstracts. The generated titles look realistic, convincing, and very close to the ground truth. We evaluate them automatically with the ROUGE metric and through a human evaluation on five parameters that compares human-written and machine-generated titles. The generated titles were judged acceptable and received higher metric ratings than the original titles. We conclude that this is a promising method for domain-controlled title generation.
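The two ideas in the abstract, the additional token and sampling from a domain-specific subset of the vocabulary, can be illustrated with a small sketch. It is not the authors' implementation. The token format `<cs>`, the function names, the toy scores, and the choice to implement the local distribution by filtering candidates and renormalising with a softmax are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import random
from typing import Dict, Set


def add_domain_token(abstract: str, domain: str) -> str:
    """Prepend a control token such as '<cs>' to the model input.

    The '<domain>' format is a hypothetical choice, not taken from the paper.
    """
    return f"<{domain}> {abstract}"


def restrict_to_domain(scores: Dict[str, float], domain_vocab: Set[str]) -> Dict[str, float]:
    """Drop tokens outside the domain vocabulary and renormalise with a softmax.

    Assumed realisation of the 'local distribution'; the paper may differ.
    """
    kept = {tok: s for tok, s in scores.items() if tok in domain_vocab}
    if not kept:
        raise ValueError("no candidate token belongs to the domain vocabulary")
    peak = max(kept.values())  # subtract the maximum for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in kept.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def sample_token(probs: Dict[str, float], rng: random.Random) -> str:
    """Draw one token according to the restricted distribution."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy example: made-up scores over a four-word "global" vocabulary.
model_input = add_domain_token("We study automatic title generation.", "cs")
global_scores = {"transformer": 2.0, "protein": 1.5, "title": 1.0, "galaxy": 0.5}
cs_vocab = {"transformer", "title"}  # hypothetical domain-specific subset
local_probs = restrict_to_domain(global_scores, cs_vocab)
next_token = sample_token(local_probs, random.Random(0))
```

In this sketch the control token tells the model which domain is requested, and the filtering step guarantees that out-of-domain words such as "protein" cannot be sampled, whatever score the model assigned to them.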
